Re: [PATCH 1/3] gimple-fold: Transform stp*cpy_chk to str*cpy directly

2021-11-13 Thread Siddhesh Poyarekar

On 11/12/21 22:46, Prathamesh Kulkarni wrote:

On Fri, 12 Nov 2021 at 01:12, Siddhesh Poyarekar wrote:


Avoid going through another folding cycle and use the ignore flag to
directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set,
likewise for BUILT_IN_STPNCPY_CHK to BUILT_IN_STRNCPY.

Dump the transformation in dump_file so that we can verify in tests that
the direct transformation actually happened.
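
A minimal user-space sketch of why the transformation is safe, assuming
only that stpcpy and strcpy differ in their return value (the helper names
below are hypothetical and not part of the patch).  When the caller ignores
the returned end pointer, both calls leave the same bytes in the
destination:

```c
#define _POSIX_C_SOURCE 200809L  /* for stpcpy() */
#include <assert.h>
#include <string.h>

/* Copy with stpcpy() and discard the returned end pointer.  */
void copy_discarding_end(char *dst, const char *src)
{
  (void) stpcpy(dst, src);
}

/* What the call can be rewritten to when the result is unused.  */
void copy_plain(char *dst, const char *src)
{
  strcpy(dst, src);
}

/* Returns 1 if both forms produce identical destination contents.  */
int same_result(const char *src)
{
  char a[32], b[32];
  assert(strlen(src) < sizeof a);
  copy_discarding_end(a, src);
  copy_plain(b, src);
  return memcmp(a, b, strlen(src) + 1) == 0;
}
```

Calling same_result ("hello") is expected to report that the two copies
match; the fold in the patch relies on the same equivalence at the GIMPLE
level.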

gcc/ChangeLog:

 * gimple-fold.c (gimple_fold_builtin_stxcpy_chk,
 gimple_fold_builtin_stxncpy_chk): Use BUILT_IN_STRNCPY if return
 value is not used.

gcc/testsuite/ChangeLog:

 * gcc.dg/fold-stringops-1.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
  gcc/gimple-fold.c   | 50 +
  gcc/testsuite/gcc.dg/fold-stringops-1.c | 23 
  2 files changed, 57 insertions(+), 16 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-1.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..92e15784803 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -3088,6 +3088,19 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterator *gsi,
return true;
  }

+static void
+dump_transformation (gimple *from, gimple *to)

I assume that both from and to will always be builtin calls ?
In that case, perhaps better to use gcall * here (and in rest of patch).
Also, needs a top-level comment describing the function.

+{
+  if (dump_file && (dump_flags & TDF_DETAILS))

Perhaps better to use dump_enabled_p ?

+{
+  fprintf (dump_file, "transformed ");

Perhaps use dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, ...) ?
I think you can use gimple_location to get the location.



Thanks, I'll fix these up.

Siddhesh


[r12-5236 Regression] FAIL: gcc.dg/tree-prof/merge_block.c scan-tree-dump-not optimized "Invalid sum" on Linux/x86_64

2021-11-13 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

5aa91072e24c1e16a5ec641b48b64c9c9f199f13 is the first bad commit
commit 5aa91072e24c1e16a5ec641b48b64c9c9f199f13
Author: Jan Hubicka 
Date:   Sat Nov 13 22:25:23 2021 +0100

Implement DSE of dead functions calls storing memory.

caused

FAIL: c-c++-common/tsan/free_race2.c   -O2  execution test
FAIL: c-c++-common/tsan/free_race.c   -O2  execution test
FAIL: gcc.dg/ipa/ipa-sra-4.c scan-ipa-dump-times sra "Will split parameter" 2
FAIL: gcc.dg/tree-prof/merge_block.c scan-tree-dump-not optimized "Invalid sum"

with GCC configured with

../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5236/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tsan.exp=c-c++-common/tsan/free_race2.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tsan.exp=c-c++-common/tsan/free_race2.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tsan.exp=c-c++-common/tsan/free_race.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tsan.exp=c-c++-common/tsan/free_race.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/ipa-sra-4.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/ipa-sra-4.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/ipa-sra-4.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/ipa-sra-4.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-prof.exp=gcc.dg/tree-prof/merge_block.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-prof.exp=gcc.dg/tree-prof/merge_block.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-prof.exp=gcc.dg/tree-prof/merge_block.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-prof.exp=gcc.dg/tree-prof/merge_block.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-11-13 Thread David Malcolm via Gcc-patches
On Sun, 2021-11-14 at 00:20 +0100, Peter Zijlstra wrote:
> On Sat, Nov 13, 2021 at 03:37:24PM -0500, David Malcolm wrote:
> 
> > This approach is much less expressive than the custom address space
> > approach; it would only cover the trust boundary aspect; it wouldn't
> > cover any differences between generic pointers and __user, vs __iomem,
> > __percpu, and __rcu which I admit I only dimly understand.
> 
> __iomem would point at device memory, which can have curious side
> effects or is yet another trust boundary, depending on device and
> usage.
> 
> __percpu is an address space that denotes a per-cpu variable's relative
> offset, it needs to be combined with a per-cpu offset to get a 'real'
> pointer, on x86_64 %gs segment offset is used for this purpose, other
> architectures are less fortunate. The whole per_cpu()/this_cpu_*()
> family of APIs accepts such pointers.
> 
> __rcu is the regular kernel address space, but denotes that the object
> pointed to has RCU lifetime management. The attribute is laundered
> through rcu_dereference() to remove the __rcu qualifier.

Thanks; this is very helpful.

> 
> > Possibly silly question: is it always a bug for the value of a kernel
> > pointer to leak into user space?  i.e. should I be complaining about an
> > infoleak if the value of a trusted_ptr itself is written to
> > *untrusted_ptr?  e.g.
> 
> Yes, always. Leaking kernel pointers is unconditionally bad.

Thanks.

FWIW I've thrown together a new warning in -fanalyzer for this, e.g.
given:

/* Some kernel space thing, where the address is presumably secret */
struct foo_t
{
} foo;

/* Response struct for some ioctl/syscall  */
struct s1
{
  void *ptr;
};

void test_1 (void __user *p)
{
  struct s1 s = {0};
  s.ptr = &foo;
  copy_to_user (p, &s, sizeof (s));
}

...my code emits...

infoleak-ptr-1.c: In function ‘test_1’:
infoleak-ptr-1.c:17:3: warning: potential exposure of sensitive
  information by copying pointer ‘&foo’ across trust boundary
  [-Wanalyzer-exposure-of-pointer]
   17 |   copy_to_user (p, &s, sizeof (s));
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

but it strikes me that there could be other sensitive information
beyond just the values of kernel-space pointers that must not cross a
trust boundary.  GCC's -fanalyzer currently has a state machine for
tracking "sensitive" values, but it's currently just a proof-of-concept
that merely treats the result of the user-space API "getpass" as
sensitive (with a demo of detecting passwords being exposed via
logfiles).  Any ideas on other values in the kernel that it would be
useful to treat as "sensitive"?  (maybe crypto private keys???  other
internal state???)  I can do it by types, by results of functions, etc.
That said, I'm not modeling the kernel's own access model (root vs
regular user etc) in the analyzer, so maybe extending things beyond
kernel space addresses is misguided?


Hope this is constructive
Dave



[PATCH] PR libgomp/103068: Optimize gomp_mutex_lock_slow for x86 target

2021-11-13 Thread Hongyu Wang via Gcc-patches
Hi, 

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

A full compare-and-swap grabs the cache line in exclusive state and causes
excessive cache line bouncing.

gomp_mutex_lock_slow spins on __atomic_compare_exchange_n, so add a
load-check that keeps spinning while the cmpxchg would fail.
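
The load-before-cmpxchg idea can be sketched in portable C11 as a
test-and-test-and-set lock.  This is only an illustration under simplifying
assumptions: model_mutex_t, model_lock and model_unlock are hypothetical
names, and the futex wait and cpu_relax () of the real
gomp_mutex_lock_slow are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* 0 = unlocked, 1 = locked; a simplified stand-in for gomp_mutex_t.  */
typedef atomic_int model_mutex_t;

void model_lock(model_mutex_t *m)
{
  assert(m != NULL);
  for (;;)
    {
      /* Read-only check first: a plain load does not need the cache
         line in exclusive state.  */
      if (atomic_load_explicit(m, memory_order_relaxed) != 0)
        continue;
      /* The lock looked free, so only now attempt the compare-and-swap.  */
      int expected = 0;
      if (atomic_compare_exchange_strong_explicit(m, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
        return;
    }
}

void model_unlock(model_mutex_t *m)
{
  assert(m != NULL);
  atomic_store_explicit(m, 0, memory_order_release);
}
```

Waiters spin on the relaxed load and only issue the compare-and-swap once
the lock has been observed free, which is the behaviour the patch enables
via TARGET_X86_AVOID_CMPXCHG.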

Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for master?

libgomp/ChangeLog:

PR libgomp/103068
* config/linux/mutex.c (gomp_mutex_lock_slow): Continue spin
loop when mutex is not 0 under x86 target.
* config/linux/x86/futex.h (TARGET_X86_AVOID_CMPXCHG): Define.
---
 libgomp/config/linux/mutex.c | 5 +
 libgomp/config/linux/x86/futex.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/libgomp/config/linux/mutex.c b/libgomp/config/linux/mutex.c
index 838264dc1f9..4e87566eb2b 100644
--- a/libgomp/config/linux/mutex.c
+++ b/libgomp/config/linux/mutex.c
@@ -49,6 +49,11 @@ gomp_mutex_lock_slow (gomp_mutex_t *mutex, int oldval)
}
   else
{
+#ifdef TARGET_X86_AVOID_CMPXCHG
+ /* For x86, omit cmpxchg when atomic load shows mutex is not 0.  */
+ if ((oldval = __atomic_load_n (mutex, MEMMODEL_RELAXED)) != 0)
+   continue;
+#endif
  /* Something changed.  If now unlocked, we're good to go.  */
  oldval = 0;
  if (__atomic_compare_exchange_n (mutex, &oldval, 1, false,
diff --git a/libgomp/config/linux/x86/futex.h b/libgomp/config/linux/x86/futex.h
index e7f53399a4e..acc1d1467d7 100644
--- a/libgomp/config/linux/x86/futex.h
+++ b/libgomp/config/linux/x86/futex.h
@@ -122,3 +122,5 @@ cpu_relax (void)
 {
   __builtin_ia32_pause ();
 }
+
+#define TARGET_X86_AVOID_CMPXCHG
-- 
2.18.1



Fix crash in gamess

2021-11-13 Thread Jan Hubicka via Gcc-patches
Hi,
this patch adds debug counters for pure/const discovery and fixes a
somewhat embarrassing pasto I made while breaking the ipa_make_function_*
helpers out of propagate_pure_const, which led to the wrong function being
marked as pure, which in turn led to wrong code.
My apologies for that.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

PR lto/103211
* dbgcnt.def (ipa_attr): New counters.
* ipa-pure-const.c: Include dbgcnt.h.
(ipa_make_function_const): Use debug counter.
(ipa_make_function_pure): Likewise.
(propagate_pure_const): Fix bug in my previous change.

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 3a85665a1d7..f8a15f3d1d1 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -175,6 +175,7 @@ DEBUG_COUNTER (if_after_reload)
 DEBUG_COUNTER (if_conversion)
 DEBUG_COUNTER (if_conversion_tree)
 DEBUG_COUNTER (if_to_switch)
+DEBUG_COUNTER (ipa_attr)
 DEBUG_COUNTER (ipa_cp_bits)
 DEBUG_COUNTER (ipa_cp_values)
 DEBUG_COUNTER (ipa_cp_vr)
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index 5056850c0a8..a332940b55d 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "ipa-fnsummary.h"
 #include "symtab-thunks.h"
+#include "dbgcnt.h"
 
 /* Lattice values for const and pure functions.  Everything starts out
being const, then may drop to pure and then neither depending on
@@ -1476,8 +1477,10 @@ ipa_make_function_const (struct cgraph_node *node, bool looping, bool local)
 fprintf (dump_file, "Function found to be %sconst: %s\n",
 looping ? "looping " : "",
 node->dump_name ());
-  if (!local)
+  if (!local && !looping)
 cdtor = node->call_for_symbol_and_aliases (cdtor_p, NULL, true);
+  if (!dbg_cnt (ipa_attr))
+return false;
   if (node->set_const_flag (true, looping))
 {
   if (dump_file)
@@ -1511,8 +1514,10 @@ ipa_make_function_pure (struct cgraph_node *node, bool looping, bool local)
 fprintf (dump_file, "Function found to be %spure: %s\n",
 looping ? "looping " : "",
 node->dump_name ());
-  if (!local)
+  if (!local && !looping)
 cdtor = node->call_for_symbol_and_aliases (cdtor_p, NULL, true);
+  if (!dbg_cnt (ipa_attr))
+return false;
   if (node->set_pure_flag (true, looping))
 {
   if (dump_file)
@@ -1797,11 +1802,11 @@ propagate_pure_const (void)
switch (this_state)
  {
  case IPA_CONST:
-   remove_p |= ipa_make_function_const (node, this_looping, false);
+   remove_p |= ipa_make_function_const (w, this_looping, false);
break;
 
  case IPA_PURE:
-   remove_p |= ipa_make_function_pure (node, this_looping, false);
+   remove_p |= ipa_make_function_pure (w, this_looping, false);
break;
 
  default:


Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-11-13 Thread Peter Zijlstra
On Sat, Nov 13, 2021 at 03:37:24PM -0500, David Malcolm wrote:

> This approach is much less expressive than the custom address space
> approach; it would only cover the trust boundary aspect; it wouldn't
> cover any differences between generic pointers and __user, vs __iomem,
> __percpu, and __rcu which I admit I only dimly understand.

__iomem would point at device memory, which can have curious side
effects or is yet another trust boundary, depending on device and usage.

__percpu is an address space that denotes a per-cpu variable's relative
offset, it needs to be combined with a per-cpu offset to get a 'real'
pointer, on x86_64 %gs segment offset is used for this purpose, other
architectures are less fortunate. The whole per_cpu()/this_cpu_*()
family of APIs accepts such pointers.
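
As a rough user-space model of the __percpu description above (not kernel
code; every name and size here is hypothetical), a per-cpu "pointer" can be
thought of as an offset that only becomes a usable address once it is
combined with a particular CPU's base:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS   4   /* hypothetical CPU count for the model */
#define MODEL_AREA_SIZE 64  /* hypothetical size of each per-CPU area */

/* One private copy of the per-CPU area for every CPU.  */
static unsigned char model_areas[MODEL_NR_CPUS][MODEL_AREA_SIZE];

/* A per-CPU "pointer" is modelled as a relative offset, not an address.  */
typedef size_t model_percpu_t;

/* Combine the relative offset with one CPU's base to get a real pointer.  */
void *model_per_cpu_ptr(model_percpu_t var, unsigned cpu)
{
  assert(cpu < MODEL_NR_CPUS && var < MODEL_AREA_SIZE);
  return &model_areas[cpu][var];
}
```

The same offset yields a different object for each CPU, which is why
dereferencing such a value directly, without going through an accessor,
would be a bug that a distinct address space can catch at type-checking
time.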

__rcu is the regular kernel address space, but denotes that the object
pointed to has RCU lifetime management. The attribute is laundered
through rcu_dereference() to remove the __rcu qualifier.

> Possibly silly question: is it always a bug for the value of a kernel
> pointer to leak into user space?  i.e. should I be complaining about an
> infoleak if the value of a trusted_ptr itself is written to
> *untrusted_ptr?  e.g.

Yes, always. Leaking kernel pointers is unconditionally bad.


[PATCH 3/6] analyzer: implement infoleak detection

2021-11-13 Thread David Malcolm via Gcc-patches
This patch adds a new -Wanalyzer-exposure-through-uninit-copy, emitted
by -fanalyzer if it detects copying of uninitialized data through
a pointer to an untrusted region.

The patch uses region::untrusted_p in the analyzer and __user in the
testsuite to identify untrusted regions, but the implementation of this
is left to follow-up patches, so that it can be done either via the
custom_address_space pragma or via __attribute__((untrusted)).

The diagnostic uses notes to express what fields and padding within
a struct have not been initialized.  For example:

infoleak-CVE-2011-1078-2.c: In function ‘test_1’:
infoleak-CVE-2011-1078-2.c:28:9: warning: potential exposure of sensitive information by copying uninitialized data from stack across trust boundary [CWE-200] [-Wanalyzer-exposure-through-uninit-copy]
   28 |         copy_to_user(optval, &cinfo, sizeof(cinfo));
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ‘test_1’: events 1-3
    |
    |   21 |         struct sco_conninfo cinfo;
    |      |                             ^~~~~
    |      |                             |
    |      |                             (1) region created on stack here
    |      |                             (2) capacity: 6 bytes
    |......
    |   28 |         copy_to_user(optval, &cinfo, sizeof(cinfo));
    |      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |         |
    |      |         (3) uninitialized data copied from stack here
    |
infoleak-CVE-2011-1078-2.c:28:9: note: 1 byte is uninitialized
   28 |         copy_to_user(optval, &cinfo, sizeof(cinfo));
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
infoleak-CVE-2011-1078-2.c:14:15: note: padding after field ‘dev_class’ is uninitialized (1 byte)
   14 |         __u8  dev_class[3];
      |               ^~~~~~~~~
infoleak-CVE-2011-1078-2.c:21:29: note: suggest forcing zero-initialization by providing a ‘{0}’ initializer
   21 |         struct sco_conninfo cinfo;
      |                             ^~~~~
      |                                   = {0}
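
To make the "1 byte of padding" in the notes concrete, here is a small
stand-alone sketch.  The struct layout is an assumption chosen to match the
6-byte capacity and the ‘dev_class’ field named in the diagnostic; the
hci_handle field, its type, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout assumed for illustration: 2 + 3 bytes of fields, 6 bytes total.  */
struct conninfo_model {
  uint16_t hci_handle;
  uint8_t  dev_class[3];
};

/* Number of bytes after the last field that no field assignment writes.  */
size_t trailing_padding(void)
{
  return sizeof(struct conninfo_model)
         - (offsetof(struct conninfo_model, dev_class) + 3);
}

/* Fill the struct so that every byte, padding included, is written.  */
void fill_conninfo(struct conninfo_model *out, uint16_t handle,
                   const uint8_t cls[3])
{
  assert(out != NULL && cls != NULL);
  memset(out, 0, sizeof *out);  /* writes the padding byte as well */
  out->hci_handle = handle;
  memcpy(out->dev_class, cls, 3);
}
```

The sketch uses memset because it explicitly writes every byte of the
object; the ‘{0}’ initializer suggested by the fix-it hint is the
alternative the diagnostic itself proposes.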

gcc/ChangeLog:
* Makefile.in (ANALYZER_OBJS): Add analyzer/trust-boundaries.o.
* doc/invoke.texi (-Wanalyzer-exposure-through-uninit-copy): New.

gcc/analyzer/ChangeLog:
* analyzer.opt (Wanalyzer-exposure-through-uninit-copy): New.
* checker-path.cc (event_kind_to_string): Handle
EK_REGION_CREATION.
(region_creation_event::region_creation_event): New.
(region_creation_event::get_desc): New.
(checker_path::add_region_creation_events): New.
* checker-path.h (enum event_kind): Add EK_REGION_CREATION.
(enum rce_kind): New.
(class region_creation_event): New.
(checker_path::add_region_creation_events): New.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Pass NULL to
add_events_for_eedge.
(diagnostic_manager::build_emission_path): Create interesting_t
instance and pass it to mark_interesting_stuff, and to
add_events_for_eedge.
(diagnostic_manager::add_events_for_eedge): Add "interest" param.
Use it to create region_creation_events for on-stack regions when
a stack frame is pushed and for heap-based and alloca regions.
(diagnostic_manager::prune_for_sm_diagnostic): Handle
EK_REGION_CREATION.
* diagnostic-manager.h (diagnostic_manager::add_events_for_eedge):
Add "interest" param.
* pending-diagnostic.cc: Include "selftest.h", "tristate.h",
"analyzer/call-string.h", "analyzer/program-point.h",
"analyzer/store.h", and "analyzer/region-model.h".
(interesting_t::add_region_creation): New.
(interesting_t::dump_to_pp): New.
* pending-diagnostic.h (struct interesting_t): New.
(pending_diagnostic::mark_interesting_stuff): New.
* region-model-impl-calls.cc (call_details::get_logger): New.
* region-model.cc: Include "analyzer/call-info.h".
(enum return_meaning): New.
(get_return_meaning): New.
(region_model::update_for_zero_return): New.
(region_model::update_for_nonzero_return): New.
(class maybe_returns_zero_call_info): New.
(class copy_success): New.
(class copy_failure): New.
(maybe_simplify_upper_bound): New.
(region_model::maybe_get_copy_bounds): New.
(struct copy_fn_details): New.
(is_copy_function): New.
(region_model::handle_copy_function): New.
(region_model::on_call_pre): Call is_copy_function and
handle_copy_function.
(region_model::set_value): Add param "src_reg".  Call
maybe_complain_about_infoleak for copies to untrusted regions.
* region-model.h (call_details::get_logger): New.
(struct copy_fn_details): New forward decl.
(region_model::handle_copy_function): New.
(region_model::maybe_get_copy_bounds): New.

[PATCH 6/6] Add __attribute__ ((tainted))

2021-11-13 Thread David Malcolm via Gcc-patches
This patch adds a new __attribute__ ((tainted)) to the C/C++ frontends.

It can be used on function decls: the analyzer will treat as tainted
all parameters to the function and all buffers pointed to by parameters
to the function.  Adding this in one place to the Linux kernel's
__SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
having tainted inputs.  This gives additional testing beyond e.g. __user
pointers added by earlier patches - an example of the use of this can be
seen in CVE-2011-2210, where given:

 SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer,
 unsigned long, nbytes, int __user *, start, void __user *, arg)

the analyzer will treat the nbytes param as under attacker control, and
can complain accordingly:

taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled value
  ‘nbytes’ as size without upper-bounds checking [CWE-129]
  [-Wanalyzer-tainted-size]
   69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
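
For illustration, the kind of upper-bounds check this warning asks for
could look like the following user-space sketch.  copy_checked is a
hypothetical helper, with memcpy standing in for the kernel copy routine;
it is not the actual kernel fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies nbytes from src only after checking the request against the
   size of the source object.  Returns 0 on success, -1 if the
   caller-supplied size is too large.  */
int copy_checked(void *dst, const void *src, size_t src_size, size_t nbytes)
{
  assert(dst != NULL && src != NULL);
  if (nbytes > src_size)  /* upper-bound check on the untrusted size */
    return -1;
  memcpy(dst, src, nbytes);
  return 0;
}
```

Once the size has been compared against a trusted bound on every path to
the copy, a taint checker has grounds to consider the value sanitized.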

Additionally, the patch allows the attribute to be used on field decls:
specifically function pointers.  Any function used as an initializer
for such a field gets treated as tainted.  An example can be seen in
CVE-2020-13143, where adding __attribute__((tainted)) to the "store"
callback of configfs_attribute:

  struct configfs_attribute {
 /* [...snip...] */
 ssize_t (*store)(struct config_item *, const char *, size_t)
   __attribute__((tainted));
 /* [...snip...] */
  };

allows the analyzer to see:

 CONFIGFS_ATTR(gadget_dev_desc_, UDC);

and treat gadget_dev_desc_UDC_store as tainted, so that it complains:

taint-CVE-2020-13143-1.c: In function ‘gadget_dev_desc_UDC_store’:
taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled value
  ‘len + 18446744073709551615’ as offset without upper-bounds checking
  [CWE-823] [-Wanalyzer-tainted-offset]
   33 |         if (name[len - 1] == '\n')
      |                 ^
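
The offset in the message, ‘len + 18446744073709551615’, is len - 1 in
unsigned 64-bit arithmetic, so a len of 0 wraps around to a huge index.  A
hedged sketch of a bounds-checked variant follows; strip_trailing_newline
and its cap parameter are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Replaces a trailing '\n' with '\0'.  cap is the trusted size of the
   buffer, len is the caller-supplied (untrusted) length.  Returns 0 on
   success, -1 if len is out of range.  */
int strip_trailing_newline(char *name, size_t cap, size_t len)
{
  assert(name != NULL);
  if (len == 0 || len > cap)  /* rejects both wrap-around and overrun */
    return -1;
  if (name[len - 1] == '\n')
    name[len - 1] = '\0';
  return 0;
}
```

With the range check in place, len - 1 is always a valid index into the
buffer before it is used.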

Similarly, the attribute could be used on the ioctl callback field,
USB device callbacks, network-handling callbacks etc.  This potentially
gives a lot of test coverage with relatively little code annotation, and
without necessarily needing link-time analysis (which -fanalyzer can
only do at present on trivial examples).

I believe this is the first time we've had an attribute on a field.
If that's an issue, I could prepare a version of the patch that
merely allowed it on functions themselves.

As before this currently still needs -fanalyzer-checker=taint (in
addition to -fanalyzer).

gcc/analyzer/ChangeLog:
* engine.cc: Include "stringpool.h", "attribs.h", and
"tree-dfa.h".
(mark_params_as_tainted): New.
(class tainted_function_custom_event): New.
(class tainted_function_info): New.
(exploded_graph::add_function_entry): Handle functions with
"tainted" attribute.
(class tainted_field_custom_event): New.
(class tainted_callback_custom_event): New.
(class tainted_call_info): New.
(add_tainted_callback): New.
(add_any_callbacks): New.
(exploded_graph::build_initial_worklist): Find callbacks that are
reachable from global initializers, calling add_any_callbacks on
them.

gcc/c-family/ChangeLog:
* c-attribs.c (c_common_attribute_table): Add "tainted".
(handle_tainted_attribute): New.

gcc/ChangeLog:
* doc/extend.texi (Function Attributes): Note that "tainted" can
be used on field decls.
(Common Function Attributes): Add entry on "tainted" attribute.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/attr-tainted-1.c: New test.
* gcc.dg/analyzer/attr-tainted-misuses.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
* gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
* gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
* gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
* gcc.dg/analyzer/taint-alloc-3.c: New test.
* gcc.dg/analyzer/taint-alloc-4.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/engine.cc| 317 +-
 gcc/c-family/c-attribs.c  |  36 ++
 gcc/doc/extend.texi   |  22 +-
 .../gcc.dg/analyzer/attr-tainted-1.c  |  88 +
 .../gcc.dg/analyzer/attr-tainted-misuses.c|   6 +
 .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +
 .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
 .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
 .../gcc.dg/analyzer/taint-CVE-2020-13143.h|  91 +
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
 11 files changed, 772 insertions(+), 3 deletions(-)
 create mode 100644 

[PATCH 1a/6] RFC: Implement "#pragma GCC custom_address_space"

2021-11-13 Thread David Malcolm via Gcc-patches
This work-in-progress patch adds a new:

  #pragma GCC custom_address_space(NAME_OF_ADDRESS_SPACE)

for use by the C front-end.

Currently the custom address spaces are:

- disjoint from all other address spaces, *including* the generic one

- treated the same as the generic address space at the RTL level (in
  terms of code generation)

- treated as "untrusted" by -fanalyzer in a follow-up patch.

but additional syntax could be added to change those defaults if
needed.

The intended use for this is in Linux kernel code, allowing e.g.:

  #define __kernel
  #pragma GCC custom_address_space(__user)
  #pragma GCC custom_address_space(__iomem)
  #pragma GCC custom_address_space(__percpu)
  #pragma GCC custom_address_space(__rcu)

so that the C front-end can complain about mismatching user-space vs
kernel-space pointers during type-checking (and that -fanalyzer can
detect infoleaks and "taint" as data is copied across trust boundaries).

Known issues:
- addr_space_convert is not implemented.
- there isn't yet a way to forcibly cast between address spaces,
  perhaps this should be a built-in function.
- only tested so far on x86_64 (probably needs to use
  ensure_builtin_addr_space everywhere in the target-specific code that
  tests against specific address space IDs).
- issue in testsuite (custom-address-space-2.c)
- issue with precompiled headers

gcc/ChangeLog:
* Makefile.in (OBJS): Add addr-space.o.
(GTFILES): Add addr-space.cc.
* addr-space.cc: New file.
* addr-space.h: New file.
* auto-inc-dec.c: Include "addr-space.h".
(find_inc): Convert targetm.addr_space. uses into addr_space_
calls.
* builtins.c: Include "addr-space.h".
(get_builtin_sync_mem): Convert targetm.addr_space. use into
addr_space_ call.
* cfgexpand.c: Include "addr-space.h".
(convert_debug_memory_address): Convert targetm.addr_space. uses
into addr_space_ calls.
(expand_debug_expr): Likewise.
* config/i386/i386.c: Include "addr-space.h".
(ix86_print_operand_address_as): Call ensure_builtin_addr_space.
* coretypes.h (ADDR_SPACE_T_MAX): New.
(struct custom_addr_space): New forward decl.
* doc/extend.texi (Named Address Spaces): Mention the new pragma.
(Custom Address Space Pragmas): New node and subsection.
* dwarf2out.c: Include "addr-space.h".
(modified_type_die): Convert targetm.addr_space. use into
addr_space_ call.
* emit-rtl.c: Include "addr-space.h".
(adjust_address_1): Convert targetm.addr_space. use into
addr_space_ call.
* explow.c: Include "addr-space.h".
(convert_memory_address_addr_space_1): Convert targetm.addr_space.
use into addr_space_ call.
(memory_address_addr_space): Likewise.
(promote_mode): Likewise.
* expr.c: Include "addr-space.h".
(store_expr): Convert targetm.addr_space. use into addr_space_ call.
(expand_expr_addr_expr): Likewise.
(expand_expr_real_2): Likewise.
(expand_expr_real_1): Likewise.
* fold-const.c: Include "addr-space.h".
(const_unop): Convert targetm.addr_space. use into addr_space_ call.
* gimple.c: Include "addr-space.h".
(check_loadstore): Convert targetm.addr_space. use into
addr_space_ call.
* lra-constraints.c: Include "addr-space.h".
(valid_address_p): Convert targetm.addr_space. use into
addr_space_ call.
* pointer-query.cc: Include "addr-space.h"; drop include of
"target.h".
(compute_objsize_r): Convert targetm.addr_space. use into
addr_space_ call.
* recog.c: Include "addr-space.h".
(memory_address_addr_space_p): Convert targetm.addr_space. use
into addr_space_ call.
(offsettable_address_addr_space_p): Likewise.
* reload.c: Include "addr-space.h".
(strict_memory_address_addr_space_p): Convert targetm.addr_space.
use into addr_space_ call.
(find_reloads_address): Likewise.
* rtlanal.c: Include "addr-space.h".
(get_address_mode): Convert targetm.addr_space. use into
addr_space_ call.
* tree-ssa-address.c: Include "addr-space.h".
(addr_for_mem_ref): Convert targetm.addr_space. use into
addr_space_ call.
(multiplier_allowed_in_address_p): Likewise.
(most_expensive_mult_to_index): Likewise.
* tree-ssa-loop-ivopts.c: Include "addr-space.h".
(addr_offset_valid_p): Convert targetm.addr_space. use into
addr_space_ call.
(produce_memory_decl_rtl): Likewise.
* tree.c: Include "addr-space.h".
(build_pointer_type_for_mode): Convert targetm.addr_space. use
into addr_space_ call.
(build_reference_type_for_mode): Likewise.
* varasm.c: Include "addr-space.h".
(make_decl_rtl): Convert targetm.addr_space. use into addr_space_
 

[PATCH 5/6] analyzer: use region::untrusted_p in taint detection

2021-11-13 Thread David Malcolm via Gcc-patches
This patch wires up the "untrusted" region logic to the analyzer's taint
detection, so that any data copied via a __user pointer (e.g. via a
suitably annotated "copy_from_user" decl) is treated as tainted.

It includes a series of reproducers for detecting CVE-2011-0521.
Unfortunately the analyzer doesn't yet detect the issue until the
code has been significantly simplified from its original form:
currently only in -5.c and -6.c in the series of tests (see notes
in the individual cases).

gcc/analyzer/ChangeLog:
* sm-taint.cc (taint_state_machine::get_default_state): New, using
region::untrusted_p.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/taint-CVE-2011-0521-1-fixed.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-1.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-2-fixed.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-2.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-3-fixed.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-3.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-4.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-5.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521-6.c: New test.
* gcc.dg/analyzer/taint-CVE-2011-0521.h: New test.
* gcc.dg/analyzer/taint-antipatterns-1.c: New test.
* gcc.dg/analyzer/taint-read-through-untrusted-ptr-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/sm-taint.cc  |  13 ++
 .../analyzer/taint-CVE-2011-0521-1-fixed.c| 113 +++
 .../gcc.dg/analyzer/taint-CVE-2011-0521-1.c   | 113 +++
 .../analyzer/taint-CVE-2011-0521-2-fixed.c|  93 
 .../gcc.dg/analyzer/taint-CVE-2011-0521-2.c   |  93 
 .../analyzer/taint-CVE-2011-0521-3-fixed.c|  56 +++
 .../gcc.dg/analyzer/taint-CVE-2011-0521-3.c   |  57 
 .../gcc.dg/analyzer/taint-CVE-2011-0521-4.c   |  40 +
 .../gcc.dg/analyzer/taint-CVE-2011-0521-5.c   |  42 ++
 .../gcc.dg/analyzer/taint-CVE-2011-0521-6.c   |  37 +
 .../gcc.dg/analyzer/taint-CVE-2011-0521.h | 136 +
 .../gcc.dg/analyzer/taint-antipatterns-1.c| 137 ++
 .../taint-read-through-untrusted-ptr-1.c  |  37 +
 13 files changed, 967 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-1-fixed.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-2-fixed.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-3-fixed.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-3.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-4.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-5.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-6.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521.h
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-antipatterns-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-read-through-untrusted-ptr-1.c

diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 0a51a1fe2ea..53ba6f2b30c 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -85,6 +85,19 @@ public:
   const extrinsic_state &ext_state)
 const FINAL OVERRIDE;
 
+  state_machine::state_t
+  get_default_state (const svalue *sval) const FINAL OVERRIDE
+  {
+/* Default to "tainted" when reading through a pointer to an untrusted
+   region.  */
+if (const initial_svalue *initial_sval = sval->dyn_cast_initial_svalue ())
+  {
+   if (initial_sval->get_region ()->untrusted_p ())
+ return m_tainted;
+  }
+return m_start;
+  }
+
   bool on_stmt (sm_context *sm_ctxt,
const supernode *node,
const gimple *stmt) const FINAL OVERRIDE;
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-1-fixed.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-1-fixed.c
new file mode 100644
index 000..a97896f2266
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-0521-1-fixed.c
@@ -0,0 +1,113 @@
+/* See notes in this header.  */
+#include "taint-CVE-2011-0521.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+/* Adapted from drivers/media/dvb/ttpci/av7110_ca.c  */
+
+int dvb_ca_ioctl(struct file *file, unsigned int cmd, void *parg)
+{
+   struct dvb_device *dvbdev = file->private_data;
+   struct av7110 *av7110 = dvbdev->priv;
+   unsigned long arg = (unsigned long) parg;
+
+   /* case CA_GET_SLOT_INFO:  */
+   {
+   ca_slot_info_t *info=(ca_slot_info_t *)parg;
+
+   if (info->num < 0 || info->num > 1)
+   return -EINVAL;
+   

[PATCH 4b/6] analyzer: implement region::untrusted_p in terms of __attribute__((untrusted))

2021-11-13 Thread David Malcolm via Gcc-patches
gcc/analyzer/ChangeLog:
* region.cc (region::untrusted_p): Implement in terms of
__attribute__((untrusted)).

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/test-uaccess.h: Change from custom_address_space
pragma to __attribute__((untrusted)).

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region.cc   | 19 +++
 gcc/testsuite/gcc.dg/analyzer/test-uaccess.h |  2 +-
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index b84504dbe42..52e9fa2d1e6 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -672,10 +672,21 @@ region::symbolic_for_unknown_ptr_p () const
 bool
 region::untrusted_p () const
 {
-  addr_space_t as = get_addr_space ();
-  /* FIXME: treat all non-generic address spaces as untrusted for now.  */
-  if (!ADDR_SPACE_GENERIC_P (as))
-return true;
+  const region *iter = this;
+  while (iter)
+{
+  if (iter->get_type ())
+   return TYPE_UNTRUSTED (iter->get_type ());
+  switch (iter->get_kind ())
+   {
+   default:
+ iter = iter->get_parent_region ();
+ continue;
+   case RK_CAST:
+ iter = iter->dyn_cast_cast_region ()->get_original_region ();
+ continue;
+   }
+}
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h 
b/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
index 0500e20b22b..280f4045418 100644
--- a/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
+++ b/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
@@ -2,7 +2,7 @@
 
 /* Adapted from include/linux/compiler.h  */
 
-#pragma GCC custom_address_space(__user)
+#define __user __attribute__((untrusted))
 
 /* Adapted from include/asm-generic/uaccess.h  */
 
-- 
2.26.3
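
For readers who want the shape of the new loop without the analyzer's class
hierarchy, here is a minimal standalone sketch in C.  Everything in it
(struct region, its fields, region_untrusted_p) is a hypothetical stand-in
invented for illustration, not the analyzer's real data structures: the walk
stops at the first region that has a type and reports that type's trust
level, stepping to the original region for casts and to the parent otherwise.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a region; not GCC's class.  */
enum region_kind { RK_OTHER, RK_CAST };

struct region
{
  enum region_kind kind;
  bool has_type;                  /* models get_type () being non-NULL */
  bool type_untrusted;            /* models TYPE_UNTRUSTED on that type */
  const struct region *parent;    /* models get_parent_region () */
  const struct region *original;  /* models get_original_region (), RK_CAST only */
};

/* Walk outwards from REG until a typed region is found and return
   whether its type is untrusted; return false if none is found.  */
static bool
region_untrusted_p (const struct region *reg)
{
  const struct region *iter = reg;
  while (iter)
    {
      if (iter->has_type)
        return iter->type_untrusted;
      /* Casts are followed to the region being cast; everything
         else falls back to the parent region.  */
      iter = (iter->kind == RK_CAST) ? iter->original : iter->parent;
    }
  return false;
}
```

Note that in this model a typed region decides the answer by itself, even
when one of its ancestors is untrusted; that mirrors the early return in the
patch.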



[PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-11-13 Thread David Malcolm via Gcc-patches
[Crossposting between gcc-patches@gcc.gnu.org and
linux-toolcha...@vger.kernel.org; sorry about my lack of kernel
knowledge, in case any of the following seems bogus]

I've been trying to turn my prototype from the LPC2021 session on
"Adding kernel-specific test coverage to GCC's -fanalyzer option"
( https://linuxplumbersconf.org/event/11/contributions/1076/ ) into
something that can go into GCC upstream without adding kernel-specific
special cases, or requiring a GCC plugin.  The prototype simply
special-cased "copy_from_user" and "copy_to_user" in GCC, which is
clearly not OK.

This GCC patch kit implements detection of "trust boundaries", aimed at
detection of "infoleaks" and of use of unsanitized attacker-controlled
values ("taint") in the Linux kernel.

For example, here's an infoleak diagnostic (using notes to
express what fields and padding within a struct have not been
initialized):

infoleak-CVE-2011-1078-2.c: In function ‘test_1’:
infoleak-CVE-2011-1078-2.c:28:9: warning: potential exposure of sensitive
  information by copying uninitialized data from stack across trust
  boundary [CWE-200] [-Wanalyzer-exposure-through-uninit-copy]
   28 | copy_to_user(optval, &cinfo, sizeof(cinfo));
  | ^~~
  ‘test_1’: events 1-3
|
|   21 | struct sco_conninfo cinfo;
|  | ^
|  | |
|  | (1) region created on stack here
|  | (2) capacity: 6 bytes
|..
|   28 | copy_to_user(optval, &cinfo, sizeof(cinfo));
|  | ~~~
|  | |
|  | (3) uninitialized data copied from stack here
|
infoleak-CVE-2011-1078-2.c:28:9: note: 1 byte is uninitialized
   28 | copy_to_user(optval, &cinfo, sizeof(cinfo));
  | ^~~
infoleak-CVE-2011-1078-2.c:14:15: note: padding after field ‘dev_class’ is 
uninitialized (1 byte)
   14 | __u8  dev_class[3];
  |   ^
infoleak-CVE-2011-1078-2.c:21:29: note: suggest forcing zero-initialization by 
providing a ‘{0}’ initializer
   21 | struct sco_conninfo cinfo;
  | ^
  |   = {0}
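
To make the "1 byte is uninitialized" note concrete, here is a small
standalone sketch.  It assumes the struct has just the two fields visible in
the diagnostic, uses fixed-width types as stand-ins for the kernel's __u16
and __u8, and the helper names are made up for this example.  It also uses
memset rather than the '{0}' initializer suggested by the note; that is a
choice made for the sketch, not something the patch kit prescribes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout, with stand-ins for the kernel's __u16 and __u8.  */
struct sco_conninfo
{
  uint16_t hci_handle;
  uint8_t dev_class[3];
};

/* Number of padding bytes that follow 'dev_class'.  */
static size_t
padding_after_dev_class (void)
{
  return sizeof (struct sco_conninfo)
         - (offsetof (struct sco_conninfo, dev_class)
            + sizeof (((struct sco_conninfo *) 0)->dev_class));
}

/* Fill in CINFO so that no byte, padding included, is left
   uninitialized before the struct is copied anywhere.  */
static void
fill_conninfo (struct sco_conninfo *cinfo, uint16_t handle,
               const uint8_t dev_class[3])
{
  memset (cinfo, 0, sizeof (*cinfo));
  cinfo->hci_handle = handle;
  memcpy (cinfo->dev_class, dev_class, 3);
}
```

On the usual ABIs the struct occupies 6 bytes (matching "capacity: 6 bytes"
above), of which 5 hold fields, so one trailing byte is padding.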

I need a way of expressing trust boundaries that will be:
- acceptable to the GCC community (not be too kernel-specific), and
- useful to the Linux kernel community.

At LPC it was pointed out that the kernel already has various
annotations e.g. "__user" for different kinds of pointers, and that it
would be best to reuse those.


Approach 1: Custom Address Spaces
=================================

GCC's C frontend supports target-specific address spaces; see:
  https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
Quoting the N1275 draft of ISO/IEC DTR 18037:
  "Address space names are ordinary identifiers, sharing the same name
  space as variables and typedef names.  Any such names follow the same
  rules for scope as other ordinary identifiers (such as typedef names).
  An implementation may provide an implementation-defined set of
  intrinsic address spaces that are, in effect, predefined at the start
  of every translation unit.  The names of intrinsic address spaces must
  be reserved identifiers (beginning with an underscore and an uppercase
  letter or with two underscores).  An implementation may also
  optionally support a means for new address space names to be defined
  within a translation unit."

Patch 1a in the following patch kit for GCC implements such a means to
define new address space names in a translation unit, via a pragma:
  #pragma GCC custom_address_space(NAME_OF_ADDRESS_SPACE)

For example, the Linux kernel could perhaps write:

  #define __kernel
  #pragma GCC custom_address_space(__user)
  #pragma GCC custom_address_space(__iomem)
  #pragma GCC custom_address_space(__percpu)
  #pragma GCC custom_address_space(__rcu)

and thus the C frontend can complain about code that mismatches __user
and kernel pointers, e.g.:

custom-address-space-1.c: In function ‘test_argpass_to_p’:
custom-address-space-1.c:29:14: error: passing argument 1 of ‘accepts_p’
from pointer to non-enclosed address space
   29 |   accepts_p (p_user);
  |  ^~
custom-address-space-1.c:21:24: note: expected ‘void *’ but argument is
of type ‘__user void *’
   21 | extern void accepts_p (void *);
  |^~
custom-address-space-1.c: In function ‘test_cast_k_to_u’:
custom-address-space-1.c:135:12: warning: cast to ‘__user’ address space
pointer from disjoint generic address space pointer
  135 |   p_user = (void __user *)p_kernel;
  |^

The patch doesn't yet maintain a good distinction between implicit
target-specific address spaces and user-defined address spaces, has at

[PATCH 4a/6] analyzer: implement region::untrusted_p in terms of custom address spaces

2021-11-13 Thread David Malcolm via Gcc-patches
gcc/analyzer/ChangeLog:
(region::untrusted_p): New.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/test-uaccess.h: New header.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region.cc   | 13 +
 gcc/testsuite/gcc.dg/analyzer/test-uaccess.h | 19 +++
 2 files changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/test-uaccess.h

diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index bb4f53b8802..b84504dbe42 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -666,6 +666,19 @@ region::symbolic_for_unknown_ptr_p () const
   return false;
 }
 
+/* Return true if accessing this region crosses a trust boundary
+   e.g. user-space memory as seen by an OS kernel.  */
+
+bool
+region::untrusted_p () const
+{
+  addr_space_t as = get_addr_space ();
+  /* FIXME: treat all non-generic address spaces as untrusted for now.  */
+  if (!ADDR_SPACE_GENERIC_P (as))
+return true;
+  return false;
+}
+
 /* region's ctor.  */
 
 region::region (complexity c, unsigned id, const region *parent, tree type)
diff --git a/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h 
b/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
new file mode 100644
index 000..0500e20b22b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
@@ -0,0 +1,19 @@
+/* Shared header for testcases for copy_from_user/copy_to_user.  */
+
+/* Adapted from include/linux/compiler.h  */
+
+#pragma GCC custom_address_space(__user)
+
+/* Adapted from include/asm-generic/uaccess.h  */
+
+extern int copy_from_user(void *to, const void __user *from, long n)
+  __attribute__((access (write_only, 1, 3),
+access (read_only, 2, 3),
+returns_zero_on_success
+));
+
+extern long copy_to_user(void __user *to, const void *from, unsigned long n)
+  __attribute__((access (write_only, 1, 3),
+access (read_only, 2, 3),
+returns_zero_on_success
+));
-- 
2.26.3



[PATCH 1b/6] Add __attribute__((untrusted))

2021-11-13 Thread David Malcolm via Gcc-patches
This patch adds a new:

  __attribute__((untrusted))

for use by the C front-end, intended for use by the Linux kernel for
use with "__user", but which could be used by other operating system
kernels, and potentially by other projects.

Known issues:
- at least one TODO in handle_untrusted_attribute
- should it be permitted to dereference an untrusted pointer?  The patch
  currently allows this
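
A rough sketch of how a project might adopt the attribute while still
building with compilers that lack it.  The fallback via __has_attribute, the
mock_copy_from_user helper, the slot table and lookup_slot are all
assumptions made up for this example; only the attribute name and the
"__user" spelling come from the patch kit.  The bounds check follows the
pattern in the taint test case adapted from av7110_ca.c.

```c
#include <errno.h>
#include <string.h>

/* __has_attribute is available in GCC 5+ and Clang; treat it as
   "not supported" elsewhere.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expand __user to the proposed attribute when the compiler knows
   it, and to nothing otherwise.  */
#if __has_attribute (untrusted)
# define __user __attribute__ ((untrusted))
#else
# define __user
#endif

struct ca_slot_info
{
  int num;
  int type;
};

/* Hypothetical stand-in for the kernel's copy_from_user; returns 0
   on success.  */
static long
mock_copy_from_user (void *to, const void __user *from, unsigned long n)
{
  memcpy (to, (const void *) from, n);
  return 0;
}

static const int slot_table[2] = { 10, 20 };

/* Copy a request across the trust boundary, then validate the
   attacker-controlled index before using it.  */
static int
lookup_slot (const void __user *arg, int *out)
{
  struct ca_slot_info info;
  if (mock_copy_from_user (&info, arg, sizeof (info)) != 0)
    return -EFAULT;
  if (info.num < 0 || info.num > 1)
    return -EINVAL;
  *out = slot_table[info.num];
  return 0;
}
```

With a compiler that does not know the attribute, __user expands to nothing
and this is ordinary C; the annotation only matters to a compiler carrying
this patch.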

gcc/c-family/ChangeLog:
* c-attribs.c (c_common_attribute_table): Add "untrusted".
(build_untrusted_type): New.
(handle_untrusted_attribute): New.
* c-pretty-print.c (pp_c_cv_qualifiers): Handle
TYPE_QUAL_UNTRUSTED.

gcc/c/ChangeLog:
* c-typeck.c (convert_for_assignment): Complain if the trust
levels vary when assigning a non-NULL pointer.

gcc/ChangeLog:
* doc/extend.texi (Common Type Attributes): Add "untrusted".
* print-tree.c (print_node): Handle TYPE_UNTRUSTED.
* tree-core.h (enum cv_qualifier): Add TYPE_QUAL_UNTRUSTED.
(struct tree_type_common): Assign one of the spare bits to a new
"untrusted_flag".
* tree.c (set_type_quals): Handle TYPE_QUAL_UNTRUSTED.
* tree.h (TYPE_QUALS): Likewise.
(TYPE_QUALS_NO_ADDR_SPACE): Likewise.
(TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): Likewise.

gcc/testsuite/ChangeLog:
* c-c++-common/attr-untrusted-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/c-family/c-attribs.c  |  59 +++
 gcc/c-family/c-pretty-print.c |   2 +
 gcc/c/c-typeck.c  |  64 +++
 gcc/doc/extend.texi   |  25 +++
 gcc/print-tree.c  |   3 +
 gcc/testsuite/c-c++-common/attr-untrusted-1.c | 165 ++
 gcc/tree-core.h   |   6 +-
 gcc/tree.c|   1 +
 gcc/tree.h|  11 +-
 9 files changed, 332 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-untrusted-1.c

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 007b928c54b..100c2dabab2 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -136,6 +136,7 @@ static tree handle_warn_unused_result_attribute (tree *, 
tree, tree, int,
 bool *);
 static tree handle_access_attribute (tree *, tree, tree, int, bool *);
 
+static tree handle_untrusted_attribute (tree *, tree, tree, int, bool *);
 static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
 static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
 static tree handle_alloc_size_attribute (tree *, tree, tree, int, bool *);
@@ -536,6 +537,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_special_var_sec_attribute, 
attr_section_exclusions },
   { "access",1, 3, false, true, true, false,
  handle_access_attribute, NULL },
+  { "untrusted", 0, 0, false,  true, false, true,
+ handle_untrusted_attribute, NULL },
   /* Attributes used by Objective-C.  */
   { "NSObject",  0, 0, true, false, false, false,
  handle_nsobject_attribute, NULL },
@@ -5224,6 +5227,62 @@ build_attr_access_from_parms (tree parms, bool 
skip_voidptr)
   return build_tree_list (name, attrargs);
 }
 
+/* Build (or reuse) a type based on BASE_TYPE, but with
+   TYPE_QUAL_UNTRUSTED.  */
+
+static tree
+build_untrusted_type (tree base_type)
+{
+  int base_type_quals = TYPE_QUALS (base_type);
+  return build_qualified_type (base_type,
+  base_type_quals | TYPE_QUAL_UNTRUSTED);
+}
+
+/* Handle an "untrusted" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_untrusted_attribute (tree *node, tree ARG_UNUSED (name),
+   tree ARG_UNUSED (args), int ARG_UNUSED (flags),
+   bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) == POINTER_TYPE)
+{
+  tree base_type = TREE_TYPE (*node);
+  tree untrusted_base_type = build_untrusted_type (base_type);
+  *node = build_pointer_type (untrusted_base_type);
+  *no_add_attrs = true; /* OK */
+  return NULL_TREE;
+}
+  else if (TREE_CODE (*node) == FUNCTION_TYPE)
+{
+  tree return_type = TREE_TYPE (*node);
+  if (TREE_CODE (return_type) == POINTER_TYPE)
+   {
+ tree base_type = TREE_TYPE (return_type);
+ tree untrusted_base_type = build_untrusted_type (base_type);
+ tree untrusted_return_type = build_pointer_type (untrusted_base_type);
+ tree fn_type = build_function_type (untrusted_return_type,
+ TYPE_ARG_TYPES (*node));
+ *node = fn_type;
+ *no_add_attrs = true; /* OK */
+ return 

[PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-13 Thread David Malcolm via Gcc-patches
This patch adds two new attributes.  The followup patch makes use of
the attributes in -fanalyzer.
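
As a hedged illustration of where such an attribute would go, the sketch
below declares a made-up function, copy_name, that follows the "zero means
success" convention.  The RETURNS_ZERO_ON_SUCCESS macro and the
__has_attribute fallback are assumptions for this example so that it builds
with compilers that do not carry this patch; what the analyzer does with the
attribute is in the follow-up patch and is not shown here.

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Use the proposed attribute only when the compiler knows it.  */
#if __has_attribute (returns_zero_on_success)
# define RETURNS_ZERO_ON_SUCCESS __attribute__ ((returns_zero_on_success))
#else
# define RETURNS_ZERO_ON_SUCCESS
#endif

/* Hypothetical helper: copy SRC into DST.  Returns 0 on success and
   nonzero if SRC (with its terminator) does not fit in DST_SIZE.  */
extern int copy_name (char *dst, size_t dst_size, const char *src)
  RETURNS_ZERO_ON_SUCCESS;

int
copy_name (char *dst, size_t dst_size, const char *src)
{
  size_t len = strlen (src);
  if (len >= dst_size)
    return 1;
  memcpy (dst, src, len + 1);
  return 0;
}
```

The return type is int because the handler in this patch rejects the
attribute on functions that do not return an integral type.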

gcc/c-family/ChangeLog:
* c-attribs.c (attr_noreturn_exclusions): Add
"returns_zero_on_failure" and "returns_zero_on_success".
(attr_returns_twice_exclusions): Likewise.
(attr_returns_zero_on_exclusions): New.
(c_common_attribute_table): Add "returns_zero_on_failure" and
"returns_zero_on_success".
(handle_returns_zero_on_attributes): New.

gcc/ChangeLog:
* doc/extend.texi (Common Function Attributes): Document
"returns_zero_on_failure" and "returns_zero_on_success".

gcc/testsuite/ChangeLog:
* c-c++-common/attr-returns-zero-on-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/c-family/c-attribs.c  | 37 ++
 gcc/doc/extend.texi   | 16 +
 .../c-c++-common/attr-returns-zero-on-1.c | 68 +++
 3 files changed, 121 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/attr-returns-zero-on-1.c

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 100c2dabab2..9e03156de5e 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -153,6 +153,7 @@ static tree handle_argspec_attribute (tree *, tree, tree, 
int, bool *);
 static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *);
 static tree handle_warn_unused_attribute (tree *, tree, tree, int, bool *);
 static tree handle_returns_nonnull_attribute (tree *, tree, tree, int, bool *);
+static tree handle_returns_zero_on_attributes (tree *, tree, tree, int, bool 
*);
 static tree handle_omp_declare_simd_attribute (tree *, tree, tree, int,
   bool *);
 static tree handle_omp_declare_variant_attribute (tree *, tree, tree, int,
@@ -221,6 +222,8 @@ extern const struct attribute_spec::exclusions 
attr_noreturn_exclusions[] =
   ATTR_EXCL ("pure", true, true, true),
   ATTR_EXCL ("returns_twice", true, true, true),
   ATTR_EXCL ("warn_unused_result", true, true, true),
+  ATTR_EXCL ("returns_zero_on_failure", true, true, true),
+  ATTR_EXCL ("returns_zero_on_success", true, true, true),
   ATTR_EXCL (NULL, false, false, false),
 };
 
@@ -235,6 +238,8 @@ attr_warn_unused_result_exclusions[] =
 static const struct attribute_spec::exclusions attr_returns_twice_exclusions[] 
=
 {
   ATTR_EXCL ("noreturn", true, true, true),
+  ATTR_EXCL ("returns_zero_on_failure", true, true, true),
+  ATTR_EXCL ("returns_zero_on_success", true, true, true),
   ATTR_EXCL (NULL, false, false, false),
 };
 
@@ -275,6 +280,16 @@ static const struct attribute_spec::exclusions 
attr_stack_protect_exclusions[] =
   ATTR_EXCL (NULL, false, false, false),
 };
 
+/* Exclusions that apply to the returns_zero_on_* attributes.  */
+static const struct attribute_spec::exclusions
+  attr_returns_zero_on_exclusions[] =
+{
+  ATTR_EXCL ("noreturn", true, true, true),
+  ATTR_EXCL ("returns_twice", true, true, true),
+  ATTR_EXCL ("returns_zero_on_failure", true, true, true),
+  ATTR_EXCL ("returns_zero_on_success", true, true, true),
+  ATTR_EXCL (NULL, false, false, false),
+};
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -493,6 +508,12 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_warn_unused_attribute, NULL },
   { "returns_nonnull",0, 0, false, true, true, false,
  handle_returns_nonnull_attribute, NULL },
+  { "returns_zero_on_failure",0, 0, false, true, true, false,
+ handle_returns_zero_on_attributes,
+ attr_returns_zero_on_exclusions },
+  { "returns_zero_on_success",0, 0, false, true, true, false,
+ handle_returns_zero_on_attributes,
+ attr_returns_zero_on_exclusions },
   { "omp declare simd",   0, -1, true,  false, false, false,
  handle_omp_declare_simd_attribute, NULL },
   { "omp declare variant base", 0, -1, true,  false, false, false,
@@ -5660,6 +5681,22 @@ handle_returns_nonnull_attribute (tree *node, tree name, 
tree, int,
   return NULL_TREE;
 }
 
+/* Handle "returns_zero_on_failure" and "returns_zero_on_success" attributes;
+   arguments as in struct attribute_spec.handler.  */
+
+static tree
+handle_returns_zero_on_attributes (tree *node, tree name, tree, int,
+  bool *no_add_attrs)
+{
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (*node)))
+{
+  error ("%qE attribute on a function not returning an integral type",
+name);
+  *no_add_attrs = true;
+}
+  return NULL_TREE;
+}
+
 /* Handle a "designated_init" attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e9f47519df2..5a6ef464779 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3784,6 

Re: [PATCH] pch: Add support for PCH for relocatable executables

2021-11-13 Thread Iain Sandoe
Hi Folks,

IMO both this series
 - which restores the ability to work with PIE exes but requires a known
   address for the PCH
and the series I posted
 - which allows a configuration to opt out of PCH anyway

could be useful - for Darwin I prefer this series.

Of course, it would be very nice to have a relocatable impl (or the tree
streamer); I fear that relying on finding a fixed hole in the VM addresses
is probably fragile w.r.t. OS updates.

> On 10 Nov 2021, at 20:24, Iain Sandoe  wrote:

>> On 10 Nov 2021, at 08:14, Iain Sandoe  wrote:
> 
>>> On 9 Nov 2021, at 12:18, Jakub Jelinek via Gcc-patches 
>>>  wrote:
>>> 
>>> On Tue, Nov 09, 2021 at 11:40:08AM +, Iain Sandoe wrote:
 There were two issues, of which one remains and probably affects all 
 targets.
> 
 2. This problem remains.
> 
> This problem is also present on master without making any changes to the PCH
> implementation - if one fixes up the read-in to simulate a corrupted file, 
> cc1 hangs
> 
> (which means it’s no barrier to the revised PCH implementation)


>> That seems reasonable for the case that we call fatal_error from ggc-common, 
>> but
>> I don’t think it will work if fancy_abort is called (for e.g. a segv) - we 
>> might need to 
>> make a local fancy_abort() as well for that specific file, perhaps.
>> 
>> Or in some way defer overwriting the data until we’ve succeeded in 
>> reading/relocating
>> the whole file (not sure what the largest PCH is we might encounter).

> 
> (answering my own question) around 150Mb for largest libstdc++ and similar 
> for an 
> Objective-C include of Foundation + AppKit etc.
> 
> The underlying reason here is that diagnostics have become much more 
> sophisticated,
> and they do all sorts of context checking and include the libcpp stuff 
> directly which is a lot
> of GTY(()) stuff.
> 
> I cannot immediately see any small set of state that we can save / restore 
> around the
> PCH read in,

I was wrong about that… patch posted that fixes most of this issue.


===

To add to Jakub's two patches that do the heavy lifting, here are two
configure changes (I also have darwin-local changes which are under test at
the moment, with the intention to apply them anyway).





0001-configure-gcc-Add-enable-pie-tools.patch
Description: Binary data


0002-configure-Add-top-level-configure-support-for-enable.patch
Description: Binary data


[PATCH] tree-optimization: [PR103218] Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit

2021-11-13 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This folds ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit inside
match.pd.
This was already handled in fold-const by:
/* A < 0 ? <sign bit of A> : 0 is simply (A & <sign bit of A>).  */
I have not removed that code, as we only simplify "a ? POW2 : 0" at the gimple level to "a
<< CST1"
and fold actually does the reverse of folding "(a<0)<> C) into -(x > 0) where C = precision(type) - 1.  */
 (for cst (INTEGER_CST VECTOR_CST)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
new file mode 100644
index 000..f086f073b38
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/103218 */
+
+/* These first two are removed during forwprop1 */
+signed char f(signed char a)
+{
+  signed char t = a < 0;
+  int tt = (unsigned char)(t << 7);
+  return tt;
+}
+signed char f0(signed char a)
+{
+  unsigned char t = a < 0;
+  int tt = (unsigned char)(t << 7);
+  return tt;
+}
+
+/* This one is removed during phiopt. */
+signed char  f1(signed char a)
+{
+if (a < 0)
+  return 1u<<7;
+return 0;
+}
+
+/* These three examples should remove "a < 0" by optimized. */
+/* { dg-final { scan-tree-dump-times "< 0" 0 "optimized"} } */
-- 
2.17.1
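
For anyone who wants to convince themselves of the identity behind this fold,
here is a small standalone check in C for the 8-bit case used in the test.
The function names are made up for this sketch, and it only exercises the
identity in plain C; it says nothing about which pass performs the
transformation.

```c
#include <limits.h>

/* Shift form: materialize (a < 0) as 0 or 1, then move it into the
   sign-bit position.  */
static unsigned char
sign_by_shift (signed char a)
{
  return (unsigned char) ((a < 0) << (CHAR_BIT - 1));
}

/* Mask form: keep only the sign bit of the converted value.  */
static unsigned char
sign_by_mask (signed char a)
{
  return (unsigned char) a & (1u << (CHAR_BIT - 1));
}

/* Return 1 if both forms agree for every signed char value.  */
static int
forms_agree (void)
{
  for (int v = SCHAR_MIN; v <= SCHAR_MAX; v++)
    if (sign_by_shift ((signed char) v) != sign_by_mask ((signed char) v))
      return 0;
  return 1;
}
```

The loop is exhaustive over signed char, so it covers every input the three
functions in the test can see.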



committed: [PATCH] fixincludes: simplify handling for access() failure [PR21283, PR80047]

2021-11-13 Thread Xi Ruoyao via Gcc-patches
On Sat, 2021-11-13 at 08:13 -0800, Bruce Korb wrote:
> Perfect.

Committed at r12-5234 with minor format fix.

> On 11/12/21 1:58 PM, Xi Ruoyao wrote:
> > diff --git a/fixincludes/fixincl.c b/fixincludes/fixincl.c
> > index 6dba2f6e830..ee57fbf61b4 100644
> > --- a/fixincludes/fixincl.c
> > +++ b/fixincludes/fixincl.c
> > @@ -1352,11 +1352,10 @@ process (void)
> >   
> >     if (access (pz_curr_file, R_OK) != 0)
> >   {
> > -  int erno = errno;
> > -  fprintf (stderr, "Cannot access %s from %s\n\terror %d
> > (%s)\n",
> > -   pz_curr_file, getcwd ((char *) NULL, MAXPATHLEN),
> > -   erno, xstrerror (erno));
> > -  return;
> > +  /* Some really strange error happened. */
> > +  fprintf (stderr, "Cannot access %s: %s\n", pz_curr_file,
> > +  xstrerror (errno));
> > +  abort();
> >   }
> >   
> >     pz_curr_data = load_file (pz_curr_file);

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Cleanup modref_access_node

2021-11-13 Thread Jan Hubicka via Gcc-patches
Hi,
this patch moves member functions of modref_access_node from ipa-modref-tree.h
to ipa-modref-tree.c since they have become long and are not suitable for
inlining anyway.  I also cleaned up the interface by adding a static insert
method (which handles inserting accesses into a vector and optimizing them);
this makes it possible to make most of the interval-merging interface private.

Honza

gcc/ChangeLog:

* ipa-modref-tree.h 
(struct modref_access_node): Move longer member functions to 
ipa-modref-tree.c
(modref_ref_node::try_merge_with): Turn into modref_access_node member
function.
* ipa-modref-tree.c (modref_access_node::contains): Move here
from ipa-modref-tree.h.
(modref_access_node::update): Likewise.
(modref_access_node::merge): Likewise.
(modref_access_node::closer_pair_p): Likewise.
(modref_access_node::forced_merge): Likewise.
(modref_access_node::update2): Likewise.
(modref_access_node::combined_offsets): Likewise.
(modref_access_node::try_merge_with): Likewise.
(modref_access_node::insert): Likewise.

diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
index d0ee487f9fa..e363c506a09 100644
--- a/gcc/ipa-modref-tree.c
+++ b/gcc/ipa-modref-tree.c
@@ -28,6 +28,541 @@ along with GCC; see the file COPYING3.  If not see
 
 #if CHECKING_P
 
+/* Return true if both accesses are the same.  */
+bool
+modref_access_node::operator == (modref_access_node &a) const
+{
+  if (parm_index != a.parm_index)
+return false;
+  if (parm_index != MODREF_UNKNOWN_PARM)
+{
+  if (parm_offset_known != a.parm_offset_known)
+   return false;
+  if (parm_offset_known
+ && !known_eq (parm_offset, a.parm_offset))
+   return false;
+}
+  if (range_info_useful_p () != a.range_info_useful_p ())
+return false;
+  if (range_info_useful_p ()
+  && (!known_eq (a.offset, offset)
+ || !known_eq (a.size, size)
+ || !known_eq (a.max_size, max_size)))
+return false;
+  return true;
+}
+
+/* Return true if A is a subaccess.  */
+bool
+modref_access_node::contains (const modref_access_node &a) const
+{
+  poly_int64 aoffset_adj = 0;
+  if (parm_index != MODREF_UNKNOWN_PARM)
+{
+  if (parm_index != a.parm_index)
+   return false;
+  if (parm_offset_known)
+   {
+  if (!a.parm_offset_known)
+return false;
+  /* Accesses are never below parm_offset, so look
+ for smaller offset.
+ If access ranges are known still allow merging
+ when bit offsets comparison passes.  */
+  if (!known_le (parm_offset, a.parm_offset)
+  && !range_info_useful_p ())
+return false;
+  /* We allow negative aoffset_adj here in case
+ there is a useful range.  This is because adding
+ a.offset may result in non-negative offset again.
+ Ubsan fails on val << LOG_BITS_PER_UNIT where val
+ is negative.  */
+  aoffset_adj = (a.parm_offset - parm_offset)
+* BITS_PER_UNIT;
+   }
+}
+  if (range_info_useful_p ())
+{
+  if (!a.range_info_useful_p ())
+   return false;
+  /* Sizes of stores are used to check that object is big enough
+to fit the store, so smaller or unknown store is more general
+than large store.  */
+  if (known_size_p (size)
+ && (!known_size_p (a.size)
+ || !known_le (size, a.size)))
+   return false;
+  if (known_size_p (max_size))
+   return known_subrange_p (a.offset + aoffset_adj,
+a.max_size, offset, max_size);
+  else
+   return known_le (offset, a.offset + aoffset_adj);
+}
+  return true;
+}
+
+/* Update access range to new parameters.
+   If RECORD_ADJUSTMENTS is true, record number of changes in the access
+   and if threshold is exceeded start dropping precision
+   so only constantly many updates are possible.  This makes dataflow
+   to converge.  */
+void
+modref_access_node::update (poly_int64 parm_offset1,
+   poly_int64 offset1, poly_int64 size1,
+   poly_int64 max_size1, bool record_adjustments)
+{
+  if (known_eq (parm_offset, parm_offset1)
+  && known_eq (offset, offset1)
+  && known_eq (size, size1)
+  && known_eq (max_size, max_size1))
+return;
+  if (!record_adjustments
+  || (++adjustments) < param_modref_max_adjustments)
+{
+  parm_offset = parm_offset1;
+  offset = offset1;
+  size = size1;
+  max_size = max_size1;
+}
+  else
+{
+  if (dump_file)
+   fprintf (dump_file,
+"--param param=modref-max-adjustments limit reached:");
+  if (!known_eq (parm_offset, parm_offset1))
+   {
+ if (dump_file)
+   fprintf (dump_file, " parm_offset cleared");
+ parm_offset_known = false;
+   }
+  if 

Re: [committed] openmp: Add support for 2 argument num_teams clause

2021-11-13 Thread H.J. Lu via Gcc-patches
On Thu, Nov 11, 2021 at 1:12 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> In OpenMP 5.1, num_teams clause can accept either one expression as before,
> but it in that case changed meaning, rather than create <= expression
> teams it is now create == expression teams.  Or it accepts two expressions
> separated by :, with the meaning that the first is low bound and second upper
> bound on how many teams should be created.  The other ways to set number of
> teams are upper bounds with lower bound of 1.
>
> The following patch does parsing of this for C/C++.  For host teams, we
> actually don't need to do anything further right now, we always create
> (pretend to create) exactly the requested number of teams, so we can just
> evaluate and throw away the lower bound for now.
> For teams nested in target, we don't guarantee that though and further
> work will be needed.
> In particular, omplower now turns the teams part of:
> struct S { S (); S (const S &); ~S (); int s; };
> void bar (S &, S &);
> int baz ();
> _Pragma ("omp declare target to (baz)");
>
> void
> foo (void)
> {
>   S a, b;
>   #pragma omp target private (a) map (b)
>   {
> #pragma omp teams firstprivate (b) num_teams (baz ())
> {
>   bar (a, b);
> }
>   }
> }
> into:
>   retval.0 = baz ();
>   retval.1 = retval.0;
>   {
> unsigned int retval.3;
> struct S * D.2549;
> struct S b;
>
> retval.3 = (unsigned int) retval.1;
> D.2549 = .omp_data_i->b;
> S::S (, D.2549);
> #pragma omp teams num_teams(retval.1) firstprivate(b) shared(a)
> __builtin_GOMP_teams (retval.3, 0);
> {
>   bar (, );
> }
> S::~S ();
> #pragma omp return(nowait)
>   }
> IMHO we want a new API, say GOMP_teams3 which will take 3 arguments
> instead of 2 (the lower and upper bounds from num_teams and thread_limit)
> and will return a bool whether it should do the teams body or not.
> And, we should add right before outermost {} above
> while (__builtin_GOMP_teams3 ((unsigned) retval.1, (unsigned) retval.1, 0))
> and remove the __builtin_GOMP_teams call.  The current function performs
> exit equivalent (at least on NVPTX) which seems bad because that means
> the destructors of e.g. private variables on target aren't invoked, and
> at the current placement neither destructors of the already constructed
> privatized variables in teams.
> I'll do this next on the compiler side, but I'm afraid I'll need help
> with the nvptx and amdgcn implementations.  E.g. for nvptx, we won't be
> able to use %ctaid.x .  I think ideal would be to use a .shared
> integer variable for the omp_get_team_num value, but I don't have any
> experience with that, are .shared variables zero initialized by default,
> or do they have random value at start?  PTX docs say they aren't 
> initializable.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
>
> 2021-11-11  Jakub Jelinek  
>
> gcc/
> * tree.h (OMP_CLAUSE_NUM_TEAMS_EXPR): Rename to ...
> (OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR): ... this.
> (OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR): Define.
> * tree.c (omp_clause_num_ops): Increase num ops for
> OMP_CLAUSE_NUM_TEAMS to 2.
> * tree-pretty-print.c (dump_omp_clause): Print optional lower bound
> for OMP_CLAUSE_NUM_TEAMS.
> * gimplify.c (gimplify_scan_omp_clauses): Gimplify
> OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR if non-NULL.
> (optimize_target_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead
> of OMP_CLAUSE_NUM_TEAMS_EXPR.  Handle OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
> * omp-low.c (lower_omp_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR
> instead of OMP_CLAUSE_NUM_TEAMS_EXPR.
> * omp-expand.c (expand_teams_call, get_target_arguments): Likewise.
> gcc/c/
> * c-parser.c (c_parser_omp_clause_num_teams): Parse optional
> lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
> Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of
> OMP_CLAUSE_NUM_TEAMS_EXPR.
> (c_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before
> combined target teams even lower-bound expression.
> gcc/cp/
> * parser.c (cp_parser_omp_clause_num_teams): Parse optional
> lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
> Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of
> OMP_CLAUSE_NUM_TEAMS_EXPR.
> (cp_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before
> combined target teams even lower-bound expression.
> * semantics.c (finish_omp_clauses): Handle
> OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR of OMP_CLAUSE_NUM_TEAMS clause.
> * pt.c (tsubst_omp_clauses): Likewise.
> (tsubst_expr): For OMP_CLAUSE_NUM_TEAMS evaluate before
> combined target teams even lower-bound expression.
> gcc/fortran/
> * trans-openmp.c (gfc_trans_omp_clauses): Use
> OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of 

[PATCH 0/2] Sync with binutils for building binutils with LTO:

2021-11-13 Thread H.J. Lu via Gcc-patches
Add the --enable-pgo-build[=lto] configure option.  When binutils+gdb
is not built together with GCC, --enable-pgo-build enables the PGO build:

1. First build with -fprofile-generate.
2. Use "make maybe-check-*" to generate profiling data and pass -i to make
to ignore errors when generating profiling data.
3. Use "make clean" to remove the previous build.
4. Rebuild with -fprofile-use.

H.J. Lu (2):
  Sync with binutils: GCC: Pass --plugin to AR and RANLIB
  Sync with binutils: Support the PGO build for binutils+gdb

 Makefile.in|  68 ++--
 Makefile.tpl   |  63 +--
 config/gcc-plugin.m4   |  28 +
 configure  | 139 -
 configure.ac   |  80 
 libiberty/Makefile.in  |   5 +-
 libiberty/aclocal.m4   |   1 +
 libiberty/configure|  37 +++
 libiberty/configure.ac |  12 
 libtool.m4 |  25 +++-
 zlib/configure |  29 -
 11 files changed, 471 insertions(+), 16 deletions(-)

-- 
2.33.1



[PATCH 1/2] Sync with binutils: GCC: Pass --plugin to AR and RANLIB

2021-11-13 Thread H.J. Lu via Gcc-patches
Sync with binutils for building binutils with LTO:

From 50ad1254d5030d0804cbf89c758359ae202e8d55 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sat, 9 Jan 2021 06:43:11 -0800
Subject: [PATCH] GCC: Pass --plugin to AR and RANLIB

Detect GCC LTO plugin.  Pass --plugin to AR and RANLIB to support LTO
build.

* Makefile.tpl (AR): Add @AR_PLUGIN_OPTION@
(RANLIB): Add @RANLIB_PLUGIN_OPTION@.
* configure.ac: Include config/gcc-plugin.m4.
AC_SUBST AR_PLUGIN_OPTION and RANLIB_PLUGIN_OPTION.
* libtool.m4 (_LT_CMD_OLD_ARCHIVE): Pass --plugin to AR and
RANLIB if possible.
* Makefile.in: Regenerated.
* configure: Likewise.

config/

* gcc-plugin.m4 (GCC_PLUGIN_OPTION): New.

libiberty/

* Makefile.in (AR): Add @AR_PLUGIN_OPTION@
(RANLIB): Add @RANLIB_PLUGIN_OPTION@.
(configure_deps): Depend on ../config/gcc-plugin.m4.
* configure.ac: AC_SUBST AR_PLUGIN_OPTION and
RANLIB_PLUGIN_OPTION.
* aclocal.m4: Regenerated.
* configure: Likewise.

zlib/

* configure: Regenerated.
---
 Makefile.in|  5 +++--
 Makefile.tpl   |  5 +++--
 config/gcc-plugin.m4   | 28 
 configure  | 39 +++
 configure.ac   | 15 +++
 libiberty/Makefile.in  |  5 +++--
 libiberty/aclocal.m4   |  1 +
 libiberty/configure| 37 +
 libiberty/configure.ac | 12 
 libtool.m4 | 25 -
 zlib/configure | 29 ++---
 11 files changed, 191 insertions(+), 10 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 860cf8f067b..13067e97327 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -415,7 +415,7 @@ MAKEINFOFLAGS = --split-size=500
 # -
 
 AS = @AS@
-AR = @AR@
+AR = @AR@ @AR_PLUGIN_OPTION@
 AR_FLAGS = rc
 CC = @CC@
 CXX = @CXX@
@@ -426,7 +426,7 @@ LIPO = @LIPO@
 NM = @NM@
 OBJDUMP = @OBJDUMP@
 OTOOL = @OTOOL@
-RANLIB = @RANLIB@
+RANLIB = @RANLIB@ @RANLIB_PLUGIN_OPTION@
 READELF = @READELF@
 STRIP = @STRIP@
 WINDRES = @WINDRES@
@@ -63384,6 +63384,7 @@ AUTOCONF = autoconf
 $(srcdir)/configure: @MAINT@ $(srcdir)/configure.ac $(srcdir)/config/acx.m4 \
$(srcdir)/config/override.m4 $(srcdir)/config/proginstall.m4 \
$(srcdir)/config/elf.m4 $(srcdir)/config/isl.m4 \
+   $(srcdir)/config/gcc-plugin.m4 \
$(srcdir)/libtool.m4 $(srcdir)/ltoptions.m4 $(srcdir)/ltsugar.m4 \
$(srcdir)/ltversion.m4 $(srcdir)/lt~obsolete.m4
cd $(srcdir) && $(AUTOCONF)
diff --git a/Makefile.tpl b/Makefile.tpl
index 213052f8226..f785b84ec9c 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -418,7 +418,7 @@ MAKEINFOFLAGS = --split-size=500
 # -
 
 AS = @AS@
-AR = @AR@
+AR = @AR@ @AR_PLUGIN_OPTION@
 AR_FLAGS = rc
 CC = @CC@
 CXX = @CXX@
@@ -429,7 +429,7 @@ LIPO = @LIPO@
 NM = @NM@
 OBJDUMP = @OBJDUMP@
 OTOOL = @OTOOL@
-RANLIB = @RANLIB@
+RANLIB = @RANLIB@ @RANLIB_PLUGIN_OPTION@
 READELF = @READELF@
 STRIP = @STRIP@
 WINDRES = @WINDRES@
@@ -2027,6 +2027,7 @@ AUTOCONF = autoconf
 $(srcdir)/configure: @MAINT@ $(srcdir)/configure.ac $(srcdir)/config/acx.m4 \
$(srcdir)/config/override.m4 $(srcdir)/config/proginstall.m4 \
$(srcdir)/config/elf.m4 $(srcdir)/config/isl.m4 \
+   $(srcdir)/config/gcc-plugin.m4 \
$(srcdir)/libtool.m4 $(srcdir)/ltoptions.m4 $(srcdir)/ltsugar.m4 \
$(srcdir)/ltversion.m4 $(srcdir)/lt~obsolete.m4
cd $(srcdir) && $(AUTOCONF)
diff --git a/config/gcc-plugin.m4 b/config/gcc-plugin.m4
index 8f278719118..c5b72e9a13d 100644
--- a/config/gcc-plugin.m4
+++ b/config/gcc-plugin.m4
@@ -124,3 +124,31 @@ AC_DEFUN([GCC_ENABLE_PLUGINS],
  fi
fi
 ])
+
+dnl
+dnl
+dnl GCC_PLUGIN_OPTION
+dnl(SHELL-CODE_HANDLER)
+dnl
+AC_DEFUN([GCC_PLUGIN_OPTION],[dnl
+AC_MSG_CHECKING([for -plugin option])
+
+plugin_names="liblto_plugin.so liblto_plugin-0.dll cyglto_plugin-0.dll"
+plugin_option=
+for plugin in $plugin_names; do
+  plugin_so=`${CC} ${CFLAGS} --print-prog-name $plugin`
+  if test x$plugin_so = x$plugin; then
+plugin_so=`${CC} ${CFLAGS} --print-file-name $plugin`
+  fi
+  if test x$plugin_so != x$plugin; then
+plugin_option="--plugin $plugin_so"
+break
+  fi
+done
+if test -n "$plugin_option"; then
+  $1="$plugin_option"
+  AC_MSG_RESULT($plugin_option)
+else
+  AC_MSG_RESULT([no])
+fi
+])
diff --git a/configure b/configure
index 58979d6e3b1..ea3111a8020 100755
--- a/configure
+++ b/configure
@@ -619,6 +619,8 @@ GFORTRAN_FOR_TARGET
 GCC_FOR_TARGET
 CXX_FOR_TARGET
 CC_FOR_TARGET
+RANLIB_PLUGIN_OPTION
+AR_PLUGIN_OPTION
 READELF
 OTOOL
 OBJDUMP
@@ -12600,6 +12602,43 @@ fi
 
 
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for -plugin option" >&5
+$as_echo_n "checking for -plugin option... " >&6; }
+
+plugin_names="liblto_plugin.so 

[PATCH 2/2] Sync with binutils: Support the PGO build for binutils+gdb

2021-11-13 Thread H.J. Lu via Gcc-patches
Sync with binutils for building binutils with LTO:

From af019bfde9b13d628202fe58054ec7ff08d92a0f Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sat, 9 Jan 2021 06:51:15 -0800
Subject: [PATCH] Support the PGO build for binutils+gdb

Add the --enable-pgo-build[=lto] configure option.  When binutils+gdb
is not built together with GCC, --enable-pgo-build enables the PGO build:

1. First build with -fprofile-generate.
2. Use "make maybe-check-*" to generate profiling data and pass -i to make
to ignore errors when generating profiling data.
3. Use "make clean" to remove the previous build.
4. Rebuild with -fprofile-use.

With --enable-pgo-build=lto, -flto=jobserver -ffat-lto-objects are used
together with -fprofile-generate and -fprofile-use.  Add '+' to the command
line for recursive make to support -flto=jobserver -ffat-lto-objects.

NB: --enable-pgo-build=lto enables the PGO build with LTO while
--enable-lto enables LTO support in toolchain.

PR binutils/26766
* Makefile.tpl (BUILD_CFLAGS): New.
(CFLAGS): Append $(BUILD_CFLAGS).
(CXXFLAGS): Likewise.
(PGO_BUILD_GEN_FLAGS_TO_PASS): New.
(PGO_BUILD_TRAINING_CFLAGS): Likewise.
(PGO_BUILD_TRAINING_CXXFLAGS): Likewise.
(PGO_BUILD_TRAINING_FLAGS_TO_PASS): Likewise.
(PGO_BUILD_TRAINING_MFLAGS): Likewise.
(PGO_BUILD_USE_FLAGS_TO_PASS): Likewise.
(PGO-TRAINING-TARGETS): Likewise.
(PGO_BUILD_TRAINING): Likewise.
(all): Add '+' to the command line for recursive make.  Support
the PGO build.
* configure.ac: Add --enable-pgo-build[=lto].
AC_SUBST PGO_BUILD_GEN_CFLAGS, PGO_BUILD_USE_CFLAGS and
PGO_BUILD_LTO_CFLAGS.  Enable the PGO build in Makefile.
* Makefile.in: Regenerated.
* configure: Likewise.
---
 Makefile.in  |  63 ++--
 Makefile.tpl |  58 --
 configure| 100 +--
 configure.ac |  65 +
 4 files changed, 280 insertions(+), 6 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 13067e97327..2b77a470694 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -444,6 +444,49 @@ LIBCXXFLAGS = $(CXXFLAGS) -fno-implicit-templates
 GOCFLAGS = $(CFLAGS)
 GDCFLAGS = $(CFLAGS)
 
+# Pass additional PGO and LTO compiler options to the PGO build.
+BUILD_CFLAGS = $(PGO_BUILD_CFLAGS) $(PGO_BUILD_LTO_CFLAGS)
+override CFLAGS += $(BUILD_CFLAGS)
+override CXXFLAGS += $(BUILD_CFLAGS)
+
+# Additional PGO and LTO compiler options to generate profiling data
+# for the PGO build.
+PGO_BUILD_GEN_FLAGS_TO_PASS = \
+   PGO_BUILD_CFLAGS="@PGO_BUILD_GEN_CFLAGS@" \
+   PGO_BUILD_LTO_CFLAGS="@PGO_BUILD_LTO_CFLAGS@"
+
+# NB: Filter out any compiler options which may fail PGO training runs.
+PGO_BUILD_TRAINING_CFLAGS:= \
+   $(filter-out -Werror=%,$(CFLAGS))
+PGO_BUILD_TRAINING_CXXFLAGS:=\
+   $(filter-out -Werror=%,$(CXXFLAGS))
+PGO_BUILD_TRAINING_CFLAGS:= \
+   $(filter-out -Wall,$(PGO_BUILD_TRAINING_CFLAGS))
+PGO_BUILD_TRAINING_CXXFLAGS:= \
+   $(filter-out -Wall,$(PGO_BUILD_TRAINING_CXXFLAGS))
+PGO_BUILD_TRAINING_CFLAGS:= \
+   $(filter-out -specs=%,$(PGO_BUILD_TRAINING_CFLAGS))
+PGO_BUILD_TRAINING_CXXFLAGS:= \
+   $(filter-out -specs=%,$(PGO_BUILD_TRAINING_CXXFLAGS))
+PGO_BUILD_TRAINING_FLAGS_TO_PASS = \
+   PGO_BUILD_TRAINING=yes \
+   CFLAGS_FOR_TARGET="$(PGO_BUILD_TRAINING_CFLAGS)" \
+   CXXFLAGS_FOR_TARGET="$(PGO_BUILD_TRAINING_CXXFLAGS)"
+
+# Ignore "make check" errors in PGO training runs.
+PGO_BUILD_TRAINING_MFLAGS = -i
+
+# Additional PGO and LTO compiler options to use profiling data for the
+# PGO build.
+PGO_BUILD_USE_FLAGS_TO_PASS = \
+   PGO_BUILD_CFLAGS="@PGO_BUILD_USE_CFLAGS@" \
+   PGO_BUILD_LTO_CFLAGS="@PGO_BUILD_LTO_CFLAGS@"
+
+# PGO training targets for the PGO build.  FIXME: Add gold tests to
+# training.
+PGO-TRAINING-TARGETS = binutils gas gdb ld sim
+PGO_BUILD_TRAINING = $(addprefix maybe-check-,$(PGO-TRAINING-TARGETS))
+
 CREATE_GCOV = create_gcov
 
 TFLAGS =
@@ -1091,6 +1134,12 @@ configure-target:  \
 
 # The target built for a native non-bootstrap build.
 .PHONY: all
+
+# --enable-pgo-build enables the PGO build.
+# 1. First build with -fprofile-generate.
+# 2. Use "make maybe-check-*" to generate profiling data.
+# 3. Use "make clean" to remove the previous build.
+# 4. Rebuild with -fprofile-use.
 all:
 @if gcc-bootstrap
[ -f stage_final ] || echo stage3 > stage_final
@@ -1099,7 +1148,7 @@ all:
$(MAKE) $(RECURSE_FLAGS_TO_PASS) `cat stage_final`-bubble
 @endif gcc-bootstrap
@: $(MAKE); $(unstage)
-   @r=`${PWD_COMMAND}`; export r; \
+   +@r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
 @if gcc-bootstrap
if [ -f stage_last ]; then \
@@ -1107,7 +1156,17 @@ all:
  $(MAKE) $(TARGET_FLAGS_TO_PASS) all-host all-target; \
else \
 @endif 

Re: [PATCH] fixincludes: simplify handling for access() failure [PR21283, PR80047]

2021-11-13 Thread Bruce Korb via Gcc-patches

Perfect.

On 11/12/21 1:58 PM, Xi Ruoyao wrote:

diff --git a/fixincludes/fixincl.c b/fixincludes/fixincl.c
index 6dba2f6e830..ee57fbf61b4 100644
--- a/fixincludes/fixincl.c
+++ b/fixincludes/fixincl.c
@@ -1352,11 +1352,10 @@ process (void)
  
if (access (pz_curr_file, R_OK) != 0)

  {
-  int erno = errno;
-  fprintf (stderr, "Cannot access %s from %s\n\terror %d (%s)\n",
-   pz_curr_file, getcwd ((char *) NULL, MAXPATHLEN),
-   erno, xstrerror (erno));
-  return;
+  /* Some really strange error happened. */
+  fprintf (stderr, "Cannot access %s: %s\n", pz_curr_file,
+  xstrerror (errno));
+  abort();
  }
  
pz_curr_data = load_file (pz_curr_file);


Re: [PATCH v2] IPA: Provide a mechanism to register static DTORs via cxa_atexit.

2021-11-13 Thread Jan Hubicka via Gcc-patches
> sheesh … EWRONGREVISEDPATCH
> 
> > On 5 Nov 2021, at 13:08, Iain Sandoe  wrote:
> > 
> > I tried enabling this on x86-64-linux (just for interest) and it seems to
> > work OK there too - but that testing revealed a thinko that didn’t show
> > with a normal regstrap.
> 
> … now with the correct patch.
> 
> [PATCH v2] IPA: Provide a mechanism to register static DTORs via
>  cxa_atexit.
> 
> For at least one target (Darwin) the platform convention is to
> register static destructors (i.e. __attribute__((destructor)))
> with __cxa_atexit rather than placing them into a list that is
> run by some other mechanism.
> 
> This patch provides a target hook that allows a target to opt
> into this and handling for the process in ipa_cdtor_merge ().
> 
> When the mode is enabled (dtors_from_cxa_atexit is set) we:
> 
>  * Generate new CTORs to register static destructors with
>__cxa_atexit and add them to the existing list of CTORs;
>we then process the revised CTORs list.
> 
>  * We sort the DTORs into priority and then TU order, this
>means that they are registered in that order with
>__cxa_atexit () and therefore will be run in the reverse
>order.
> 
>  * Likewise, CTORs are sorted into priority and then TU order,
>which means that they will run in that order.
> 
> This matches the behavior of using init/fini (or
> mod_init_func/mod_term_func) sections.
> 
> Signed-off-by: Iain Sandoe 
> 
> gcc/ChangeLog:
> 
>   * config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
>   * doc/tm.texi: Regenerated.
>   * doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
>   * ipa.c (ipa_discover_variable_flags):
>   (cgraph_build_static_cdtor_1): Return the built function
>   decl.
>   (build_cxa_atexit_decl): New.
>   (build_dso_handle_decl): New.
>   (build_cxa_dtor_registrations): New.
>   (compare_cdtor_tu_order): New.
>   (build_cxa_atexit_fns): New.
>   (ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
>   process the DTORs/CTORs accordingly.
>   (pass_ipa_cdtor_merge::gate): Also run if
>   dtors_from_cxa_atexit is set.
>   * target.def (dtors_from_cxa_atexit): New hook.

OK, thanks!
Honza
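
For readers following the ordering argument in the quoted description, here
is a minimal sketch of why registering in (priority, TU) order gives the
reverse run order.  It only models the LIFO behaviour of an atexit-style
list; it does not call __cxa_atexit, and the Dtor struct, its fields and
run_order are hypothetical names chosen for illustration.  Sorting by
ascending priority number is also an assumption of the sketch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record for one static destructor.
struct Dtor {
  int priority;  // destructor priority
  int tu_order;  // position of the translation unit
  int id;        // label used to observe the resulting order
};

// Sort by priority, then TU order, "register" in that order, and return
// the order in which a LIFO atexit-style list would run the entries.
std::vector<int> run_order(std::vector<Dtor> dtors) {
  std::stable_sort(dtors.begin(), dtors.end(),
                   [](const Dtor &a, const Dtor &b) {
                     if (a.priority != b.priority)
                       return a.priority < b.priority;
                     return a.tu_order < b.tu_order;
                   });

  std::vector<int> registered;  // stands in for the registration list
  for (const Dtor &d : dtors)
    registered.push_back(d.id);

  // Handlers registered last run first.
  return std::vector<int>(registered.rbegin(), registered.rend());
}
```

The entry registered last is the first one to run, so the run order is the
sorted order reversed.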


Enable more type attributes for signature changes

2021-11-13 Thread Jan Hubicka via Gcc-patches
Hi,
this patch whitelists type attributes that are safe to keep across signature
changes, and also drops the access attribute when the function signature
changes.  We could do better by updating the attribute, but that seems to
snowball a bit, since with LTO the warnings produced seem a bit confused.
We would also like to output the original name of the function instead of
mangledname.constprop or so.  I looked into which attributes are dropped
during bootstrap and it does not look too bad.
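
The attribute dropping can be sketched roughly as below: walk the attribute
list and copy only the entries that stay valid after a parameter change, so
that a possibly shared original list is never modified.  This is a
simplified model, not the GCC code: Attr is a hypothetical stand-in for a
TREE_LIST node, and drop_if_params_changed and filter_attrs are illustrative
names.

```cpp
#include <string>

// Hypothetical stand-in for one node of an attribute list.
struct Attr {
  std::string name;
  Attr *next;
};

// Attributes that describe individual parameters become stale once the
// parameter list changes, so they are dropped.
bool drop_if_params_changed(const std::string &name) {
  return name == "fn spec" || name == "access";
}

// Build a fresh list holding copies of the surviving attributes.  The
// input list is left untouched.  The caller owns the returned nodes.
Attr *filter_attrs(const Attr *list) {
  Attr *head = nullptr;
  Attr **last = &head;  // always points at the link to fill in next
  for (const Attr *a = list; a; a = a->next)
    if (!drop_if_params_changed(a->name)) {
      *last = new Attr{a->name, nullptr};
      last = &(*last)->next;
    }
  return head;
}
```

Keeping a pointer to the tail link avoids a special case for the first
surviving element and preserves the original order.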

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

gcc/ChangeLog:

* ipa-fnsummary.c (compute_fn_summary): Use type_attribute_allowed_p.
* ipa-param-manipulation.c 
(ipa_param_adjustments::type_attribute_allowed_p):
New member function.
(drop_type_attribute_if_params_changed_p): New function.
(build_adjusted_function_type): Use it.
* ipa-param-manipulation.h: Add type_attribute_allowed_p.

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 94a80d3ec90..7e9201a554a 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3141,8 +3141,8 @@ compute_fn_summary (struct cgraph_node *node, bool early)
  modref summaries.  */
for (tree list = TYPE_ATTRIBUTES (TREE_TYPE (node->decl));
list && !no_signature; list = TREE_CHAIN (list))
-if (!flag_ipa_modref
-|| !is_attribute_p ("fn spec", get_attribute_name (list)))
+   if (!ipa_param_adjustments::type_attribute_allowed_p
+   (get_attribute_name (list)))
   {
 if (dump_file)
{
diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 991db0d9b1b..29268fa5a58 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -279,6 +279,32 @@ fill_vector_of_new_param_types (vec *new_types, 
vec *otypes,
 }
 }
 
+/* Return false if given attribute should prevent type adjustments.  */
+
+bool
+ipa_param_adjustments::type_attribute_allowed_p (tree name)
+{
+  if ((is_attribute_p ("fn spec", name) && flag_ipa_modref)
+  || is_attribute_p ("access", name)
+  || is_attribute_p ("returns_nonnull", name)
+  || is_attribute_p ("assume_aligned", name)
+  || is_attribute_p ("nocf_check", name)
+  || is_attribute_p ("warn_unused_result", name))
+return true;
+  return false;
+}
+
+/* Return true if attribute should be dropped if parameter changed.  */
+
+static bool
+drop_type_attribute_if_params_changed_p (tree name)
+{
+  if (is_attribute_p ("fn spec", name)
+  || is_attribute_p ("access", name))
+return true;
+  return false;
+}
+
 /* Build and return a function type just like ORIG_TYPE but with parameter
types given in NEW_PARAM_TYPES - which can be NULL if, but only if,
ORIG_TYPE itself has NULL TREE_ARG_TYPEs.  If METHOD2FUNC is true, also make
@@ -337,16 +363,19 @@ build_adjusted_function_type (tree orig_type, vec 
*new_param_types,
   if (skip_return)
TREE_TYPE (new_type) = void_type_node;
 }
-  /* We only support one fn spec attribute on type.  Be sure to remove it.
- Once we support multiple attributes we will need to be able to unshare
- the list.  */
   if (args_modified && TYPE_ATTRIBUTES (new_type))
 {
-  gcc_checking_assert
- (!TREE_CHAIN (TYPE_ATTRIBUTES (new_type))
-  && (is_attribute_p ("fn spec",
- get_attribute_name (TYPE_ATTRIBUTES (new_type)))));
+  tree t = TYPE_ATTRIBUTES (new_type);
+  tree *last = &TYPE_ATTRIBUTES (new_type);
   TYPE_ATTRIBUTES (new_type) = NULL;
+  for (;t; t = TREE_CHAIN (t))
+   if (!drop_type_attribute_if_params_changed_p
+   (get_attribute_name (t)))
+ {
+   *last = copy_node (t);
+   TREE_CHAIN (*last) = NULL;
+   last = &TREE_CHAIN (*last);
+ }
 }
 
   return new_type;
diff --git a/gcc/ipa-param-manipulation.h b/gcc/ipa-param-manipulation.h
index 9440cbfc56c..5adf8a22356 100644
--- a/gcc/ipa-param-manipulation.h
+++ b/gcc/ipa-param-manipulation.h
@@ -254,6 +254,7 @@ public:
   /* If true, make the function not return any value.  */
   bool m_skip_return;
 
+  static bool type_attribute_allowed_p (tree);
 private:
   ipa_param_adjustments () {}
 


[PATCH] Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds.

2021-11-13 Thread Iain Sandoe via Gcc-patches
Most of the time we get away with using the dsymutil that is
installed with the latest Xcode, however for some cross-compilation
cases that does not work.

We now have the ability to specify the correct dsymutil to use for
the toolchain (--with-dsymutil=) and we should use that specified
tool for debug link.  Fixes cross-compilers from x86-64 to powerpc.

Tested on x86_64, i686 and with a cross from x86_64 -> powerpc, and
with a bootstrap on x86_64-linux.

OK for master?
thanks
Iain

Signed-off-by: Iain Sandoe 

gcc/ada/ChangeLog:

* gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in
libgnat/libgnarl recipes.
---
 gcc/ada/gcc-interface/Makefile.in | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
index 9df88097226..53d0739470a 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -800,8 +800,8 @@ gnatlib-shared-darwin:
libgnat$(soext)
cd $(RTSDIR); $(LN_S) libgnarl$(hyphen)$(LIBRARY_VERSION)$(soext) \
libgnarl$(soext)
-   cd $(RTSDIR); dsymutil libgnat$(hyphen)$(LIBRARY_VERSION)$(soext)
-   cd $(RTSDIR); dsymutil libgnarl$(hyphen)$(LIBRARY_VERSION)$(soext)
+   cd $(RTSDIR); $(DSYMUTIL_FOR_TARGET) libgnat$(hyphen)$(LIBRARY_VERSION)$(soext)
+   cd $(RTSDIR); $(DSYMUTIL_FOR_TARGET) libgnarl$(hyphen)$(LIBRARY_VERSION)$(soext)
 
 gnatlib-shared:
$(MAKE) $(FLAGS_TO_PASS) \
-- 
2.24.3 (Apple Git-128)



[committed] analyzer: add four new taint-based warnings

2021-11-13 Thread David Malcolm via Gcc-patches
The initial commit of the analyzer in GCC 10 had a single warning,
  -Wanalyzer-tainted-array-index
and required manually enabling the taint checker with
-fanalyzer-checker=taint (due to scaling issues).

This patch extends the taint detection to add four new taint-based
warnings:

  -Wanalyzer-tainted-allocation-size
 for e.g. attacker-controlled malloc/alloca
  -Wanalyzer-tainted-divisor
 for detecting where an attacker can inject a divide-by-zero
  -Wanalyzer-tainted-offset
 for attacker-controlled pointer offsets
  -Wanalyzer-tainted-size
 for e.g. attacker-controlled memset

and rewords all the warnings to talk about "attacker-controlled" values
rather than "tainted" values.

Unfortunately I haven't yet addressed the scaling issues, so all of
these still require -fanalyzer-checker=taint (in addition to -fanalyzer).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as b9365b93212041f14a7f71ba8da5af4d82240dc6.

gcc/analyzer/ChangeLog:
* analyzer.opt (Wanalyzer-tainted-allocation-size): New.
(Wanalyzer-tainted-divisor): New.
(Wanalyzer-tainted-offset): New.
(Wanalyzer-tainted-size): New.
* engine.cc (impl_region_model_context::get_taint_map): New.
* exploded-graph.h (impl_region_model_context::get_taint_map):
New decl.
* program-state.cc (sm_state_map::get_state): Call
alt_get_inherited_state.
(sm_state_map::impl_set_state): Modify states within
compound svalues.
(program_state::impl_call_analyzer_dump_state): Undo casts.
(selftest::test_program_state_1): Update for new context param of
create_region_for_heap_alloc.
(selftest::test_program_state_merging): Likewise.
* region-model-impl-calls.cc (region_model::impl_call_alloca):
Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_realloc): Likewise.
* region-model.cc (region_model::check_region_access): Call
check_region_for_taint.
(region_model::get_representative_path_var_1): Handle binops.
(region_model::create_region_for_heap_alloc): Add "ctxt" param and
pass it to set_dynamic_extents.
(region_model::create_region_for_alloca): Likewise.
(region_model::set_dynamic_extents): Add "ctxt" param and use it
to call check_dynamic_size_for_taint.
(selftest::test_state_merging): Update for new context param of
create_region_for_heap_alloc.
(selftest::test_malloc_constraints): Likewise.
(selftest::test_malloc): Likewise.
(selftest::test_alloca): Likewise for create_region_for_alloca.
* region-model.h (region_model::create_region_for_heap_alloc): Add
"ctxt" param.
(region_model::create_region_for_alloca): Likewise.
(region_model::set_dynamic_extents): Likewise.
(region_model::check_dynamic_size_for_taint): New decl.
(region_model::check_region_for_taint): New decl.
(region_model_context::get_taint_map): New vfunc.
(noop_region_model_context::get_taint_map): New.
* sm-taint.cc: Remove include of "diagnostic-event-id.h"; add
includes of "gimple-iterator.h", "tristate.h", "selftest.h",
"ordered-hash-map.h", "cgraph.h", "cfg.h", "digraph.h",
"analyzer/supergraph.h", "analyzer/call-string.h",
"analyzer/program-point.h", "analyzer/store.h",
"analyzer/region-model.h", and "analyzer/program-state.h".
(enum bounds): Move to top of file.
(class taint_diagnostic): New.
(class tainted_array_index): Convert to subclass of taint_diagnostic.
(tainted_array_index::emit): Add CWE-129.  Reword warning to use
"attacker-controlled" rather than "tainted".
(tainted_array_index::describe_state_change): Move to
taint_diagnostic::describe_state_change.
(tainted_array_index::describe_final_event): Reword to use
"attacker-controlled" rather than "tainted".
(class tainted_offset): New.
(class tainted_size): New.
(class tainted_divisor): New.
(class tainted_allocation_size): New.
(taint_state_machine::alt_get_inherited_state): New.
(taint_state_machine::on_stmt): In assignment handling, remove
ARRAY_REF handling in favor of check_region_for_taint.  Add
detection of tainted divisors.
(taint_state_machine::get_taint): New.
(taint_state_machine::combine_states): New.
(region_model::check_region_for_taint): New.
(region_model::check_dynamic_size_for_taint): New.
* sm.h (state_machine::alt_get_inherited_state): New.

gcc/ChangeLog:
* doc/invoke.texi (Static Analyzer Options): Add
-Wno-analyzer-tainted-allocation-size,
-Wno-analyzer-tainted-divisor, 

Remember fnspec EAF flags in modref summary

2021-11-13 Thread Jan Hubicka via Gcc-patches
Hi,
this patch stores EAF flags from fnspec in the modref summaries.  This makes
them survive signature changes and also improves IPA propagation in case
modref is not able to autodetect a given flag.
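
As a rough illustration of the combination step: the flags promised by the
fnspec annotation and the flags the analysis detects are both guarantees
about the argument, so the summary can keep their union.  The sketch below
is a self-contained model, not GCC code.  The bit values are made up (the
real EAF_* constants differ), and ArgSpec, spec_eaf_flags and summary_flags
are hypothetical names.

```cpp
#include <cstdint>

// Illustrative bit values only; GCC's real EAF_* constants differ.
enum : std::uint32_t {
  EAF_UNUSED                  = 1u << 0,
  EAF_NO_DIRECT_CLOBBER       = 1u << 1,
  EAF_NO_INDIRECT_CLOBBER     = 1u << 2,
  EAF_NO_DIRECT_ESCAPE        = 1u << 3,
  EAF_NO_INDIRECT_ESCAPE      = 1u << 4,
  EAF_NO_INDIRECT_READ        = 1u << 5,
  EAF_NOT_RETURNED_INDIRECTLY = 1u << 6,
};

// Hypothetical summary of what a fnspec string says about one argument.
struct ArgSpec {
  bool specified, used, direct, noescape, readonly;
};

// Translate the annotation into flags, following the same case split as
// arg_eaf_flags in the patch.
std::uint32_t spec_eaf_flags(const ArgSpec &a) {
  std::uint32_t flags = 0;
  if (!a.specified)
    return flags;
  if (!a.used)
    return EAF_UNUSED;
  if (a.direct)
    flags |= EAF_NO_INDIRECT_READ | EAF_NO_INDIRECT_ESCAPE
             | EAF_NOT_RETURNED_INDIRECTLY | EAF_NO_INDIRECT_CLOBBER;
  if (a.noescape)
    flags |= EAF_NO_DIRECT_ESCAPE | EAF_NO_INDIRECT_ESCAPE;
  if (a.readonly)
    flags |= EAF_NO_DIRECT_CLOBBER | EAF_NO_INDIRECT_CLOBBER;
  return flags;
}

// Union of what the analysis proved and what the annotation promises.
std::uint32_t summary_flags(std::uint32_t detected, const ArgSpec &a) {
  return detected | spec_eaf_flags(a);
}
```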

Bootstrapped/regtested x86_64-linux, committed.

Honza

gcc/ChangeLog:

* attr-fnspec.h (attr_fnspec::arg_eaf_flags): Break out from ...
* gimple.c (gimple_call_arg_flags): ... here.
* ipa-modref.c (analyze_parms): Record flags known from fnspec.
(modref_merge_call_site_flags): Use arg_eaf_flags.

diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
index 1154c30e7b0..cd618cb342b 100644
--- a/gcc/attr-fnspec.h
+++ b/gcc/attr-fnspec.h
@@ -264,6 +264,29 @@ public:
 return str[1] == 'C' || str[1] == 'P';
   }
 
+  /* Return EAF flags for arg I.  */
+  int
+  arg_eaf_flags (unsigned int i)
+  {
+int flags = 0;
+
+if (!arg_specified_p (i))
+  ;
+else if (!arg_used_p (i))
+  flags = EAF_UNUSED;
+else
+  {
+   if (arg_direct_p (i))
+ flags |= EAF_NO_INDIRECT_READ | EAF_NO_INDIRECT_ESCAPE
+  | EAF_NOT_RETURNED_INDIRECTLY | EAF_NO_INDIRECT_CLOBBER;
+   if (arg_noescape_p (i))
+ flags |= EAF_NO_DIRECT_ESCAPE | EAF_NO_INDIRECT_ESCAPE;
+   if (arg_readonly_p (i))
+ flags |= EAF_NO_DIRECT_CLOBBER | EAF_NO_INDIRECT_CLOBBER;
+  }
+return flags;
+  }
+
   /* Check validity of the string.  */
   void verify ();
 
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 1e0fad92e15..037c6e4c827 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1567,22 +1567,7 @@ gimple_call_arg_flags (const gcall *stmt, unsigned arg)
   int flags = 0;
 
   if (fnspec.known_p ())
-{
-  if (!fnspec.arg_specified_p (arg))
-   ;
-  else if (!fnspec.arg_used_p (arg))
-   flags = EAF_UNUSED;
-  else
-   {
- if (fnspec.arg_direct_p (arg))
-   flags |= EAF_NO_INDIRECT_READ | EAF_NO_INDIRECT_ESCAPE
-| EAF_NOT_RETURNED_INDIRECTLY | EAF_NO_INDIRECT_CLOBBER;
- if (fnspec.arg_noescape_p (arg))
-   flags |= EAF_NO_DIRECT_ESCAPE | EAF_NO_INDIRECT_ESCAPE;
- if (fnspec.arg_readonly_p (arg))
-   flags |= EAF_NO_DIRECT_CLOBBER | EAF_NO_INDIRECT_CLOBBER;
-   }
-}
+flags = fnspec.arg_eaf_flags (arg);
   tree callee = gimple_call_fndecl (stmt);
   if (callee)
 {
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 90985cc1326..669dbe45a3d 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -2476,6 +2476,14 @@ analyze_parms (modref_summary *summary, 
modref_summary_lto *summary_lto,
   /* Do the dataflow.  */
   eaf_analysis.propagate ();
 
+  tree attr = lookup_attribute ("fn spec",
+   TYPE_ATTRIBUTES
+ (TREE_TYPE (current_function_decl)));
+  attr_fnspec fnspec (attr
+ ? TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)))
+ : "");
+
+
   /* Store results to summaries.  */
   for (tree parm = DECL_ARGUMENTS (current_function_decl); parm; parm_index++,
parm = TREE_CHAIN (parm))
@@ -2502,6 +2510,18 @@ analyze_parms (modref_summary *summary, 
modref_summary_lto *summary_lto,
  continue;
}
   int flags = eaf_analysis.get_ssa_name_flags (name);
+  int attr_flags = fnspec.arg_eaf_flags (parm_index);
+
+  if (dump_file && (flags | attr_flags) != flags && !(flags & EAF_UNUSED))
+   {
+ fprintf (dump_file,
+  "  Flags for param %i combined with fnspec flags:",
+  (int)parm_index);
+ dump_eaf_flags (dump_file, attr_flags, false);
+ fprintf (dump_file, " determined: ");
+ dump_eaf_flags (dump_file, flags, true);
+   }
+  flags |= attr_flags;
 
   /* Eliminate useless flags so we do not end up storing unnecessary
 summaries.  */
@@ -2522,8 +2542,8 @@ analyze_parms (modref_summary *summary, 
modref_summary_lto *summary_lto,
   "  Flags for param %i combined with IPA pass:",
   (int)parm_index);
  dump_eaf_flags (dump_file, past, false);
- fprintf (dump_file, " local ");
- dump_eaf_flags (dump_file, flags | past, true);
+ fprintf (dump_file, " determined: ");
+ dump_eaf_flags (dump_file, flags, true);
}
  if (!(flags & EAF_UNUSED))
flags |= past;
@@ -2561,7 +2581,7 @@ analyze_parms (modref_summary *summary, 
modref_summary_lto *summary_lto,
  fprintf (dump_file,
   "  Retslot flags combined with IPA pass:");
  dump_eaf_flags (dump_file, past, false);
- fprintf (dump_file, " local ");
+ fprintf (dump_file, " determined: ");
  dump_eaf_flags (dump_file, flags, true);
}
   if (!(flags & EAF_UNUSED))
@@ -2591,7 +2611,7 @@ analyze_parms (modref_summary *summary, 
modref_summary_lto *summary_lto,
  fprintf (dump_file,
   "  

Re: [COMMITTED] path solver: Solve PHI imports first for ranges.

2021-11-13 Thread Aldy Hernandez via Gcc-patches
On Sat, Nov 13, 2021 at 12:55 PM Aldy Hernandez  wrote:
>
> On Sat, Nov 13, 2021 at 10:41 AM Aldy Hernandez  wrote:
> >
> > On Sat, Nov 13, 2021 at 1:51 AM Andrew MacLeod  wrote:
> > >
> > > On 11/12/21 14:50, Richard Biener via Gcc-patches wrote:
> > > > On November 12, 2021 8:46:25 PM GMT+01:00, Aldy Hernandez via 
> > > > Gcc-patches  wrote:
> > > >> PHIs must be resolved first while solving ranges in a block,
> > > >> regardless of where they appear in the import bitmap.  We went through
> > > >> a similar exercise for the relational code, but missed these.
> > > > Must not all stmts be resolved in program order (for optimality at 
> > > > least)?
> > >
> > > Generally,Imports are live on entry values to a block, so their order is
> > > not particularly important.. they are all simultaneous. PHIs are also
> > > considered imports for data flow purposes, but they happen before the
> > > first stmt, all simultaneously... they need to be distinguished because
> > > phi arguments can refer to other phi defs which may be in this block
> > > live around a back edge, and we need to be sure we get the right version.
> > >
> > > we should look closer to be sure this isn't an accidental fix that
> > > leaves the root problem .   we need to be sure *all* the PHI arguments
> > > are resolved from outside this block. whats the testcase?
> >
> > The testcase is the simpler testcase from the PR:
> >
> > https://gcc.gnu.org/bugzilla/attachment.cgi?id=51776
> >
> > The gist is on a path coming in from BB13:
> >
> > # n_42 = PHI 
> > # m_31 = PHI <0(13), m_16(4)>
> >
> > We were solving m_31 first and putting it in the cache, and then the
> > calculation for n_42 picked up this cached m_31 incorrectly.
> >
> > With my patch we do the PHIs first, in whatever gphi_iterator order
> > uses, which I assume is the order in the IL above.
> >
> > However, if PHIs must be resolved simultaneously, then perhaps we need
> > to tweak this.  Suppose we flip the definitions:
> >
> > # m_31 = PHI <0(13), m_16(4)>
> > # n_42 = PHI 
> >
> > I assume the definition of n_42 should pick up the incoming m_31(13),
> > not one defined in the other PHI.  In which case, we could resolve all
> > the PHIs first, but put them in the cache after we're done with all of
> > them.
>
> And lo and behold, a PR just came in exhibiting this exact behavior,
> saving me from having to come up with a reduced testcase ;-).
>
> The testcase in the PR has a path coming in from BB5:
>
> # p3_7 = PHI <1(2), 0(5)>
> # p2_17 = PHI <1(2), p3_7(5)>
>
> We're picking up the p3_7 in the PHI when calculating p2_17.
>
> Attached is the patch in testing.

Tested on x86-64 & ppc64le Linux.

Pushed.



Re: [COMMITTED] path solver: Solve PHI imports first for ranges.

2021-11-13 Thread Aldy Hernandez via Gcc-patches
On Sat, Nov 13, 2021 at 2:26 PM Richard Biener
 wrote:
>
> On November 13, 2021 10:41:02 AM GMT+01:00, Aldy Hernandez  
> wrote:
> >On Sat, Nov 13, 2021 at 1:51 AM Andrew MacLeod  wrote:
> >>
> >> On 11/12/21 14:50, Richard Biener via Gcc-patches wrote:
> >> > On November 12, 2021 8:46:25 PM GMT+01:00, Aldy Hernandez via 
> >> > Gcc-patches  wrote:
> >> >> PHIs must be resolved first while solving ranges in a block,
> >> >> regardless of where they appear in the import bitmap.  We went through
> >> >> a similar exercise for the relational code, but missed these.
> >> > Must not all stmts be resolved in program order (for optimality at 
> >> > least)?
> >>
> >> Generally,Imports are live on entry values to a block, so their order is
> >> not particularly important.. they are all simultaneous. PHIs are also
> >> considered imports for data flow purposes, but they happen before the
> >> first stmt, all simultaneously... they need to be distinguished because
> >> phi arguments can refer to other phi defs which may be in this block
> >> live around a back edge, and we need to be sure we get the right version.
> >>
> >> we should look closer to be sure this isn't an accidental fix that
> >> leaves the root problem .   we need to be sure *all* the PHI arguments
> >> are resolved from outside this block. whats the testcase?
> >
> >The testcase is the simpler testcase from the PR:
> >
> >https://gcc.gnu.org/bugzilla/attachment.cgi?id=51776
> >
> >The gist is on a path coming in from BB13:
> >
> ># n_42 = PHI 
> ># m_31 = PHI <0(13), m_16(4)>
> >
> >We were solving m_31 first and putting it in the cache, and then the
> >calculation for n_42 picked up this cached m_31 incorrectly.
> >
> >With my patch we do the PHIs first, in whatever gphi_iterator order
> >uses, which I assume is the order in the IL above.
> >
> >However, if PHIs must be resolved simultaneously, then perhaps we need
> >to tweak this.  Suppose we flip the definitions:
> >
> ># m_31 = PHI <0(13), m_16(4)>
> ># n_42 = PHI 
> >
> >I assume the definition of n_42 should pick up the incoming m_31(13),
> >not one defined in the other PHI.  In which case, we could resolve all
> >the PHIs first, but put them in the cache after we're done with all of
> >them.
>
> PHI order is irrelevant, they are executed in parallel, thus arguments pick 
> up the old value irrespective of order.
>

Ughh, yeah.  Just noticed, per my follow-up patch for PR103222.

Tested on x86-64 & ppc64le Linux, and pushed.

Thanks.
Aldy
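
For the archive, a small model of the "PHIs execute in parallel" point.
This is not the path solver code, just a sketch under simplifying
assumptions: each PHI is reduced to the single argument for the edge being
followed, values are plain ints rather than ranges, and Phi, PhiArg and both
eval functions are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// The PHI argument for the incoming edge: a constant or another name.
struct PhiArg {
  bool is_const;
  int constant;      // used when is_const is true
  std::string name;  // used when is_const is false
};

struct Phi {
  std::string result;  // name defined by the PHI
  PhiArg arg;          // argument on the edge we arrived on
};

using Values = std::map<std::string, int>;

// Evaluate every PHI against the values on entry and commit afterwards,
// so no PHI can observe another PHI's new value.
Values eval_phis_parallel(const std::vector<Phi> &phis,
                          const Values &on_entry) {
  Values pending;
  for (const Phi &phi : phis)
    pending[phi.result] = phi.arg.is_const ? phi.arg.constant
                                           : on_entry.at(phi.arg.name);
  Values out = on_entry;
  for (const auto &kv : pending)
    out[kv.first] = kv.second;
  return out;
}

// The problematic variant: each result is written back immediately, so a
// later PHI may pick up a value defined by an earlier PHI in the block.
Values eval_phis_sequential(const std::vector<Phi> &phis, Values values) {
  for (const Phi &phi : phis)
    values[phi.result] = phi.arg.is_const ? phi.arg.constant
                                          : values.at(phi.arg.name);
  return values;
}
```

With the two PHIs from the second PR on the edge from BB5, and assuming
p3_7 was 1 on entry, the parallel version gives p2_17 the old p3_7 while the
sequential one hands it the freshly computed 0.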



Re: [COMMITTED] path solver: Solve PHI imports first for ranges.

2021-11-13 Thread Richard Biener via Gcc-patches
On November 13, 2021 10:41:02 AM GMT+01:00, Aldy Hernandez  
wrote:
>On Sat, Nov 13, 2021 at 1:51 AM Andrew MacLeod  wrote:
>>
>> On 11/12/21 14:50, Richard Biener via Gcc-patches wrote:
>> > On November 12, 2021 8:46:25 PM GMT+01:00, Aldy Hernandez via Gcc-patches 
>> >  wrote:
>> >> PHIs must be resolved first while solving ranges in a block,
>> >> regardless of where they appear in the import bitmap.  We went through
>> >> a similar exercise for the relational code, but missed these.
>> > Must not all stmts be resolved in program order (for optimality at least)?
>>
>> Generally,Imports are live on entry values to a block, so their order is
>> not particularly important.. they are all simultaneous. PHIs are also
>> considered imports for data flow purposes, but they happen before the
>> first stmt, all simultaneously... they need to be distinguished because
>> phi arguments can refer to other phi defs which may be in this block
>> live around a back edge, and we need to be sure we get the right version.
>>
>> we should look closer to be sure this isn't an accidental fix that
>> leaves the root problem .   we need to be sure *all* the PHI arguments
>> are resolved from outside this block. whats the testcase?
>
>The testcase is the simpler testcase from the PR:
>
>https://gcc.gnu.org/bugzilla/attachment.cgi?id=51776
>
>The gist is on a path coming in from BB13:
>
># n_42 = PHI 
># m_31 = PHI <0(13), m_16(4)>
>
>We were solving m_31 first and putting it in the cache, and then the
>calculation for n_42 picked up this cached m_31 incorrectly.
>
>With my patch we do the PHIs first, in whatever gphi_iterator order
>uses, which I assume is the order in the IL above.
>
>However, if PHIs must be resolved simultaneously, then perhaps we need
>to tweak this.  Suppose we flip the definitions:
>
># m_31 = PHI <0(13), m_16(4)>
># n_42 = PHI 
>
>I assume the definition of n_42 should pick up the incoming m_31(13),
>not one defined in the other PHI.  In which case, we could resolve all
>the PHIs first, but put them in the cache after we're done with all of
>them.

PHI order is irrelevant, they are executed in parallel, thus arguments pick up 
the old value irrespective of order. 

Richard. 
>
>Thoughts?
>Aldy
>



Re: [PATCH] rs6000: MMA test case emits wrong code when building a vector pair

2021-11-13 Thread Segher Boessenkool
On Wed, Oct 27, 2021 at 08:37:57PM -0500, Peter Bergner wrote:
> PR102976 shows a test case where we generate wrong code when building
> a vector pair from 2 vector registers.  The bug here is that with unlucky
> register assignments, we can clobber one of the input operands before
> we write both registers of the output operand.  The solution is to use
> early-clobbers in the assemble pair and accumulator patterns.

Because of what insns there are after the split.  Aha.

Please add a comment explaining this, near the earlyclobber itself.

A usually nicer way of doing it is by special casing the split code for
this situation.  But with the comment in place the way you do it might
even be preferable here :-)

> +/* { dg-final { scan-assembler-times {xxlor[^,]*,44,44} 1 } } */
> +/* { dg-final { scan-assembler-times {xxlor[^,]*,32,32} 1 } } */

Bracket expressions using ^ match newlines as well, unless you use
(partial) newline-sensitive matching.  Partial is almost always what you
want, so start the regex with (?p) ?  You also want to add some \m and
\M btw.  For example, as written this will match xxlorc insns as well.
Not a super big deal, but :-)

You can just write this as {\mxxlor \d+,44,44\M} etc., that will be
simplest I think.
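
To illustrate the false positive, here is a minimal sketch (not part of the
patch).  It uses C++ std::regex with ECMAScript syntax, where \b is assumed as
a stand-in for Tcl's \m and \M word anchors; the helper names and the sample
assembly lines are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Loose pattern, as in the original test: "[^,]*" also swallows the
// trailing "c" of an xxlorc mnemonic.
bool
loose_match (const std::string &line)
{
  static const std::regex re (R"(xxlor[^,]*,44,44)");
  return std::regex_search (line, re);
}

// Anchored pattern: the mnemonic must be exactly "xxlor" followed by a
// space and a register number.  ECMAScript "\b" replaces Tcl "\m"/"\M".
bool
anchored_match (const std::string &line)
{
  static const std::regex re (R"(\bxxlor \d+,44,44\b)");
  return std::regex_search (line, re);
}

// Self-check on two hypothetical assembly lines.
void
self_check ()
{
  assert (loose_match ("\txxlor 0,44,44"));
  assert (loose_match ("\txxlorc 0,44,44"));     // false positive
  assert (anchored_match ("\txxlor 0,44,44"));
  assert (!anchored_match ("\txxlorc 0,44,44")); // correctly rejected
}
```

The anchored form trades generality for precision, which is usually what a
scan-assembler-times count needs.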

Okay for trunk with comments added near the earlyclobber, and the RE
improved.  Also fine for 11 after some burn-in.  Thanks!


Segher


[PATCH][_GLIBCXX_DEBUG] Code cleanup/simplification

2021-11-13 Thread François Dumont via Gcc-patches

    libstdc++: [_GLIBCXX_DEBUG] Remove _Safe_container<>::_M_safe()

    Container code cleanup to get rid of _Safe_container<>::_M_safe() and just
    use _Safe:: calls which use normal inheritance. Also remove several usages
    of _M_base(), which can most of the time be omitted and are sometimes
    replaced with explicit _Base:: calls.

    libstdc++-v3/ChangeLog:

    * include/debug/safe_container.h (_Safe_container<>::_M_safe): Remove.
    * include/debug/deque (deque::operator=(initializer_list<>)): Replace
    _M_base() call with _Base:: call.
    (deque::operator[](size_type)): Likewise.
    * include/debug/forward_list
    (forward_list(forward_list&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (forward_list::operator=(initializer_list<>)): Remove _M_base() calls.
    (forward_list::splice_after, forward_list::merge): Likewise.
    * include/debug/list (list(list&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (list::operator=(initializer_list<>)): Remove _M_base() calls.
    (list::splice, list::merge): Likewise.
    * include/debug/map.h (map(map&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (map::operator=(initializer_list<>)): Remove _M_base() calls.
    * include/debug/multimap.h (multimap(multimap&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (multimap::operator=(initializer_list<>)): Remove _M_base() calls.
    * include/debug/set.h (set(set&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (set::operator=(initializer_list<>)): Remove _M_base() calls.
    * include/debug/multiset.h (multiset(multiset&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (multiset::operator=(initializer_list<>)): Remove _M_base() calls.
    * include/debug/string
    (basic_string(basic_string&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (basic_string::operator=(initializer_list<>)): Remove _M_base() call.
    (basic_string::operator=(const _CharT*),
    basic_string::operator=(_CharT)): Likewise.
    (basic_string::operator[](size_type),
    basic_string::operator+=(const basic_string&)): Likewise.
    (basic_string::operator+=(const _CharT*),
    basic_string::operator+=(_CharT)): Likewise.
    * include/debug/unordered_map
    (unordered_map(unordered_map&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (unordered_map::operator=(initializer_list<>),
    unordered_map::merge): Remove _M_base() calls.
    (unordered_multimap(unordered_multimap&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (unordered_multimap::operator=(initializer_list<>),
    unordered_multimap::merge): Remove _M_base() calls.
    * include/debug/unordered_set
    (unordered_set(unordered_set&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (unordered_set::operator=(initializer_list<>),
    unordered_set::merge): Remove _M_base() calls.
    (unordered_multiset(unordered_multiset&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (unordered_multiset::operator=(initializer_list<>),
    unordered_multiset::merge): Remove _M_base() calls.
    * include/debug/vector (vector(vector&&, const allocator_type&)):
    Remove _M_safe() and _M_base() calls.
    (vector::operator=(initializer_list<>)): Remove _M_base() calls.
    (vector::operator[](size_type)): Likewise.

Tested under Linux x86_64 _GLIBCXX_DEBUG mode -std=gnu++14 and -std=gnu++98.

Ok to commit ?

François


diff --git a/libstdc++-v3/include/debug/deque b/libstdc++-v3/include/debug/deque
index 52778ba1617..a73d6c34209 100644
--- a/libstdc++-v3/include/debug/deque
+++ b/libstdc++-v3/include/debug/deque
@@ -166,7 +166,7 @@ namespace __debug
   deque&
   operator=(initializer_list __l)
   {
-	_M_base() = __l;
+	_Base::operator=(__l);
 	this->_M_invalidate_all();
 	return *this;
   }
@@ -344,7 +344,7 @@ namespace __debug
   operator[](size_type __n) _GLIBCXX_NOEXCEPT
   {
 	__glibcxx_check_subscript(__n);
-	return _M_base()[__n];
+	return _Base::operator[](__n);
   }
 
   _GLIBCXX_NODISCARD
@@ -352,7 +352,7 @@ namespace __debug
   operator[](size_type __n) const _GLIBCXX_NOEXCEPT
   {
 	__glibcxx_check_subscript(__n);
-	return _M_base()[__n];
+	return _Base::operator[](__n);
   }
 
   using _Base::at;
diff --git a/libstdc++-v3/include/debug/forward_list b/libstdc++-v3/include/debug/forward_list
index cae5b5f038b..6ed4853af40 100644
--- a/libstdc++-v3/include/debug/forward_list
+++ 

[PATCH] libsanitizer: Merge with upstream

2021-11-13 Thread H.J. Lu via Gcc-patches
Merged revision: 82bc6a094e85014f1891ef9407496f44af8fe442

with the fix for PR sanitizer/102911
---
 libsanitizer/MERGE|   2 +-
 libsanitizer/asan/asan_allocator.cpp  |  17 ++-
 libsanitizer/asan/asan_globals.cpp|  19 +++
 libsanitizer/asan/asan_interceptors.h |   7 +-
 libsanitizer/asan/asan_malloc_linux.cpp   | 115 --
 libsanitizer/asan/asan_mapping.h  |   2 +-
 libsanitizer/hwasan/hwasan.cpp|   2 +-
 .../hwasan/hwasan_allocation_functions.cpp|  59 +++--
 libsanitizer/hwasan/hwasan_exceptions.cpp |   4 +-
 libsanitizer/hwasan/hwasan_fuchsia.cpp|   2 +-
 libsanitizer/hwasan/hwasan_linux.cpp  |   2 +-
 libsanitizer/hwasan/hwasan_thread.cpp |  22 ++--
 libsanitizer/hwasan/hwasan_thread.h   |  10 +-
 libsanitizer/lsan/lsan_common.cpp |  31 ++---
 libsanitizer/lsan/lsan_common.h   |   9 +-
 libsanitizer/lsan/lsan_common_mac.cpp |   2 +-
 libsanitizer/lsan/lsan_interceptors.cpp   |  44 ---
 .../sanitizer_common/sanitizer_addrhashmap.h  |  38 ++
 .../sanitizer_allocator_combined.h|   6 +-
 .../sanitizer_allocator_dlsym.h   |  79 
 .../sanitizer_allocator_primary32.h   |   6 +-
 .../sanitizer_allocator_secondary.h   |   8 +-
 .../sanitizer_deadlock_detector.h |   2 +-
 .../sanitizer_common/sanitizer_linux.cpp  |  48 +---
 .../sanitizer_common/sanitizer_linux.h|  12 +-
 .../sanitizer_linux_libcdep.cpp   |   4 -
 .../sanitizer_common/sanitizer_mac.cpp|  15 +--
 libsanitizer/sanitizer_common/sanitizer_mac.h |  20 ---
 .../sanitizer_common/sanitizer_malloc_mac.inc |  20 ++-
 .../sanitizer_platform_interceptors.h |   6 +-
 .../sanitizer_platform_limits_linux.cpp   |   5 +-
 .../sanitizer_platform_limits_posix.h |   2 +-
 .../sanitizer_common/sanitizer_procmaps.h |  18 ++-
 .../sanitizer_common/sanitizer_stacktrace.cpp |  17 +--
 libsanitizer/tsan/tsan_interceptors_posix.cpp |  38 +-
 libsanitizer/tsan/tsan_rtl.cpp|   6 +-
 libsanitizer/tsan/tsan_rtl.h  |   2 +-
 libsanitizer/tsan/tsan_rtl_amd64.S|  74 +++
 libsanitizer/tsan/tsan_rtl_ppc64.S|   1 -
 libsanitizer/ubsan/ubsan_flags.cpp|   1 -
 libsanitizer/ubsan/ubsan_handlers.cpp |  15 ---
 libsanitizer/ubsan/ubsan_handlers.h   |   8 --
 libsanitizer/ubsan/ubsan_platform.h   |   2 -
 43 files changed, 469 insertions(+), 333 deletions(-)
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_allocator_dlsym.h

diff --git a/libsanitizer/MERGE b/libsanitizer/MERGE
index c3463bffbae..01913de5d66 100644
--- a/libsanitizer/MERGE
+++ b/libsanitizer/MERGE
@@ -1,4 +1,4 @@
-78d3e0a4f1406b17cdecc77540e09210670fe9a9
+82bc6a094e85014f1891ef9407496f44af8fe442
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
diff --git a/libsanitizer/asan/asan_allocator.cpp 
b/libsanitizer/asan/asan_allocator.cpp
index 6d7073710bd..3fa36742060 100644
--- a/libsanitizer/asan/asan_allocator.cpp
+++ b/libsanitizer/asan/asan_allocator.cpp
@@ -102,19 +102,18 @@ class ChunkHeader {
 
  public:
   uptr UsedSize() const {
-uptr R = user_requested_size_lo;
-if (sizeof(uptr) > sizeof(user_requested_size_lo))
-  R += (uptr)user_requested_size_hi << (8 * 
sizeof(user_requested_size_lo));
-return R;
+static_assert(sizeof(user_requested_size_lo) == 4,
+  "Expression below requires this");
+return FIRST_32_SECOND_64(0, ((uptr)user_requested_size_hi << 32)) +
+   user_requested_size_lo;
   }
 
   void SetUsedSize(uptr size) {
 user_requested_size_lo = size;
-if (sizeof(uptr) > sizeof(user_requested_size_lo)) {
-  size >>= (8 * sizeof(user_requested_size_lo));
-  user_requested_size_hi = size;
-  CHECK_EQ(user_requested_size_hi, size);
-}
+static_assert(sizeof(user_requested_size_lo) == 4,
+  "Expression below requires this");
+user_requested_size_hi = FIRST_32_SECOND_64(0, size >> 32);
+CHECK_EQ(UsedSize(), size);
   }
 
   void SetAllocContext(u32 tid, u32 stack) {
diff --git a/libsanitizer/asan/asan_globals.cpp 
b/libsanitizer/asan/asan_globals.cpp
index 94004877227..5f56fe6f457 100644
--- a/libsanitizer/asan/asan_globals.cpp
+++ b/libsanitizer/asan/asan_globals.cpp
@@ -154,6 +154,23 @@ static void CheckODRViolationViaIndicator(const Global *g) 
{
   }
 }
 
+// Check ODR violation for given global G by checking if it's already poisoned.
+// We use this method in case compiler doesn't use private aliases for global
+// variables.
+static void CheckODRViolationViaPoisoning(const Global *g) {
+  if (__asan_region_is_poisoned(g->beg, g->size_with_redzone)) {
+// This check may not be enough: if the first global is much larger
+

[PATCH] PCH: Make the save and restore diagnostics more robust.

2021-11-13 Thread Iain Sandoe via Gcc-patches
When saving, if we cannot obtain a suitable memory segment there
is no point in continuing, so exit with an error.

When reading in the PCH, we have a situation that the read-in
data will replace the line tables used by the diagnostics output.
However, the state of the read-in line tables is indeterminate
at some points where diagnostics might be needed.

To make this more robust, we save the existing line tables at
the start and, once we have read in the pointer to the new one,
put that to one side and restore the original table.  This
avoids compiler hangs if the read or memory acquisition code
issues an assert, fatal_error, segv etc.

Once the read is complete, we swap in the new line table that
came from the PCH.

If the read-in PCH is corrupted then we still have a broken
compilation w.r.t. any future diagnostics - but there is little
that can be done about that without more careful validation of
the file.
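
The save, stash and swap sequence described above can be sketched roughly as
follows.  This is a simplified illustration, not the actual ggc-common.c code:
line_maps_sketch, current_table and restore_with_saved_table are hypothetical
names, and the loader callback stands in for the part of the read that may
fail.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the line table consulted by diagnostics.
struct line_maps_sketch
{
  std::string origin;
};

// Hypothetical global, playing the role of line_table.
line_maps_sketch *current_table = nullptr;

// LOADED is the pointer that the bulk read of global pointers would drop
// into current_table.  LOAD_REST stands for the remaining work (memory
// acquisition, reading the data) and returns false on failure.
template <typename Loader>
bool
restore_with_saved_table (line_maps_sketch *loaded, Loader load_rest)
{
  line_maps_sketch *saved = current_table;    // save the table in use
  current_table = loaded;                     // the read overwrites the global
  line_maps_sketch *pending = current_table;  // stash the new pointer ...
  current_table = saved;                      // ... and restore the old one
  if (!load_rest ())
    return false;                             // diagnostics still use SAVED
  current_table = pending;                    // load complete: swap in
  return true;
}

void
self_check ()
{
  line_maps_sketch before {"source"}, from_pch {"pch"};
  current_table = &before;
  bool saw_old_table = false;
  bool ok = restore_with_saved_table (&from_pch, [&] {
    // Anything reported from here would use the original table.
    saw_old_table = (current_table == &before);
    return true;
  });
  assert (ok && saw_old_table);
  assert (current_table == &from_pch);
  current_table = nullptr;  // do not leave a dangling pointer behind
}
```

The point of the ordering is that the only window in which the global refers
to not-yet-valid data is the two assignments around the stash.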

I've tested this by hacking and rebuilding the compiler to
produce various kinds of failure.  At present, it is hard to
see how to make testcases to do this.  Now reg-testing on more
systems.

OK for master if reg-tests pass?
thanks
Iain

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* ggc-common.c (gt_pch_save): If we cannot find a suitable
memory segment for save, then error-out, do not try to
continue.
(gt_pch_restore): Save the existing line table, and when
the replacement is being read, use that when constructing
diagnostics.
---
 gcc/ggc-common.c | 39 +--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c
index 32ba5be42b2..b6abed1d9a2 100644
--- a/gcc/ggc-common.c
+++ b/gcc/ggc-common.c
@@ -440,6 +440,10 @@ gt_pch_save (FILE *f)
  (The extra work goes in HOST_HOOKS_GT_PCH_GET_ADDRESS and
  HOST_HOOKS_GT_PCH_USE_ADDRESS.)  */
   mmi.preferred_base = host_hooks.gt_pch_get_address (mmi.size, fileno (f));
+  /* If the host cannot supply any suitable address for this, we are stuck.  */
+  if (mmi.preferred_base == NULL)
+fatal_error (input_location,
+"cannot write PCH file: required memory segment unavailable");
 
   ggc_pch_this_base (state.d, mmi.preferred_base);
 
@@ -589,6 +593,13 @@ gt_pch_restore (FILE *f)
   struct mmap_info mmi;
   int result;
 
+  /* We are about to reload the line maps along with the rest of the PCH
+ data, which means that the (loaded) ones cannot be guaranteed to be
+ in any valid state for reporting diagnostics that happen during the
+ load.  Save the current table (and use it during the loading process
+ below).  */
+  class line_maps *save_line_table = line_table;
+
   /* Delete any deletable objects.  This makes ggc_pch_read much
  faster, as it can be sure that no GCable objects remain other
  than the ones just read in.  */
@@ -603,20 +614,40 @@ gt_pch_restore (FILE *f)
fatal_error (input_location, "cannot read PCH file: %m");
 
   /* Read in all the global pointers, in 6 easy loops.  */
+  bool error_reading_pointers = false;
   for (rt = gt_ggc_rtab; *rt; rt++)
 for (rti = *rt; rti->base != NULL; rti++)
   for (i = 0; i < rti->nelt; i++)
if (fread ((char *)rti->base + rti->stride * i,
   sizeof (void *), 1, f) != 1)
- fatal_error (input_location, "cannot read PCH file: %m");
+ error_reading_pointers = true;
+
+  /* Stash the newly read-in line table pointer - it does not point to
+ anything meaningful yet, so swap the old one back in.  */
+  class line_maps *new_line_table = line_table;
+  line_table = save_line_table;
+  if (error_reading_pointers)
+fatal_error (input_location, "cannot read PCH file: %m");
 
  if (fread (&mmi, sizeof (mmi), 1, f) != 1)
 fatal_error (input_location, "cannot read PCH file: %m");
 
   result = host_hooks.gt_pch_use_address (mmi.preferred_base, mmi.size,
  fileno (f), mmi.offset);
+
+  /* We could not mmap or otherwise allocate the required memory at the
+ address needed.  */
   if (result < 0)
-fatal_error (input_location, "had to relocate PCH");
+{
+  sorry_at (input_location, "PCH relocation is not yet supported");
+  /* There is no point in continuing from here, we will only end up
+with a crashed (most likely hanging) compiler.  */
+  exit (-1);
+}
+
+  /* (0) We allocated memory, but did not mmap the file, so we need to read
+ the data in manually.  (>0) Otherwise the mmap succeed for the address
+ we wanted.  */
   if (result == 0)
 {
   if (fseek (f, mmi.offset, SEEK_SET) != 0
@@ -629,6 +660,10 @@ gt_pch_restore (FILE *f)
   ggc_pch_read (f, mmi.preferred_base);
 
   gt_pch_restore_stringpool ();
+
+  /* Barring corruption of the PCH file, the restored line table should be
+ complete and usable.  */
+  line_table = new_line_table;
 }
 
 /* Default version of 

Re: [COMMITTED] path solver: Solve PHI imports first for ranges.

2021-11-13 Thread Aldy Hernandez via Gcc-patches
On Sat, Nov 13, 2021 at 10:41 AM Aldy Hernandez  wrote:
>
> On Sat, Nov 13, 2021 at 1:51 AM Andrew MacLeod  wrote:
> >
> > On 11/12/21 14:50, Richard Biener via Gcc-patches wrote:
> > > On November 12, 2021 8:46:25 PM GMT+01:00, Aldy Hernandez via Gcc-patches 
> > >  wrote:
> > >> PHIs must be resolved first while solving ranges in a block,
> > >> regardless of where they appear in the import bitmap.  We went through
> > >> a similar exercise for the relational code, but missed these.
> > > Must not all stmts be resolved in program order (for optimality at least)?
> >
> > Generally,Imports are live on entry values to a block, so their order is
> > not particularly important.. they are all simultaneous. PHIs are also
> > considered imports for data flow purposes, but they happen before the
> > first stmt, all simultaneously... they need to be distinguished because
> > phi arguments can refer to other phi defs which may be in this block
> > live around a back edge, and we need to be sure we get the right version.
> >
> > we should look closer to be sure this isn't an accidental fix that
> > leaves the root problem .   we need to be sure *all* the PHI arguments
> > are resolved from outside this block. whats the testcase?
>
> The testcase is the simpler testcase from the PR:
>
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51776
>
> The gist is on a path coming in from BB13:
>
> # n_42 = PHI 
> # m_31 = PHI <0(13), m_16(4)>
>
> We were solving m_31 first and putting it in the cache, and then the
> calculation for n_42 picked up this cached m_31 incorrectly.
>
> With my patch we do the PHIs first, in whatever gphi_iterator order
> uses, which I assume is the order in the IL above.
>
> However, if PHIs must be resolved simultaneously, then perhaps we need
> to tweak this.  Suppose we flip the definitions:
>
> # m_31 = PHI <0(13), m_16(4)>
> # n_42 = PHI 
>
> I assume the definition of n_42 should pick up the incoming m_31(13),
> not one defined in the other PHI.  In which case, we could resolve all
> the PHIs first, but put them in the cache after we're done with all of
> them.

And lo and behold, a PR just came in exhibiting this exact behavior,
saving me from having to come up with a reduced testcase ;-).

The testcase in the PR has a path coming in from BB5:

# p3_7 = PHI <1(2), 0(5)>
# p2_17 = PHI <1(2), p3_7(5)>

We're picking up the p3_7 in the PHI when calculating p2_17.
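
As a rough illustration of why the evaluation must be simultaneous (this is
not the code in the patch): the sketch below uses plain integers instead of
ranges, and phi_sketch, cache_sketch and the entry value 1 for p3_7 are
assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified PHI on one incoming edge: RESULT takes either
// a constant or the value of another name.
struct phi_sketch
{
  std::string result;
  bool is_const;
  int constant;        // used when is_const
  std::string source;  // used when !is_const
};

using cache_sketch = std::map<std::string, int>;

// Evaluate one PHI argument against the given cache.
int
eval_phi (const phi_sketch &p, const cache_sketch &cache)
{
  return p.is_const ? p.constant : cache.at (p.source);
}

// Order dependent: each result is visible to the PHIs that follow it.
void
solve_sequential (const std::vector<phi_sketch> &phis, cache_sketch &cache)
{
  for (const phi_sketch &p : phis)
    cache[p.result] = eval_phi (p, cache);
}

// Two phases: evaluate everything against the values on entry, then
// commit all results at once.
void
solve_simultaneous (const std::vector<phi_sketch> &phis, cache_sketch &cache)
{
  cache_sketch tmp;
  for (const phi_sketch &p : phis)
    tmp[p.result] = eval_phi (p, cache);
  for (const auto &entry : tmp)
    cache[entry.first] = entry.second;
}

void
self_check ()
{
  // On the edge from BB5: p3_7 = 0 and p2_17 = p3_7.
  std::vector<phi_sketch> phis
    = { {"p3_7", true, 0, ""}, {"p2_17", false, 0, "p3_7"} };
  cache_sketch seq = { {"p3_7", 1} };  // assumed value of p3_7 on entry
  cache_sketch sim = seq;
  solve_sequential (phis, seq);
  solve_simultaneous (phis, sim);
  assert (seq["p2_17"] == 0);  // picked up the new p3_7: wrong
  assert (sim["p2_17"] == 1);  // picked up the value on entry: right
  assert (sim["p3_7"] == 0);
}
```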

Attached is the patch in testing.
From bbe7d177711cd930d2b66679482a6892d9bd4348 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Sat, 13 Nov 2021 12:37:25 +0100
Subject: [PATCH] path solver: Compute all PHI ranges simultaneously.

PHIs must be resolved simultaneously, otherwise we may not pick up the
ranges incoming to the block.

For example, if we put p3_7 in the cache before all PHIs have been
computed, we will pick up the wrong p3_7 value for p2_17:

# p3_7 = PHI <1(2), 0(5)>
# p2_17 = PHI <1(2), p3_7(5)>

This patch delays updating the cache until all PHIs have been
analyzed.

gcc/ChangeLog:

	PR tree-optimization/103222
	* gimple-range-path.cc (path_range_query::compute_ranges_in_phis):
	New.
	(path_range_query::compute_ranges_in_block): Call
	compute_ranges_in_phis.
	* gimple-range-path.h (path_range_query::compute_ranges_in_phis):
	New.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr103222.c: New test.
---
 gcc/gimple-range-path.cc| 42 ++---
 gcc/gimple-range-path.h |  3 +++
 gcc/testsuite/gcc.dg/pr103222.c | 33 ++
 3 files changed, 69 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103222.c

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 32b2cb57597..9957ac9b6c7 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -343,6 +343,38 @@ path_range_query::range_defined_in_block (irange &r, tree name, basic_block bb)
   return true;
 }
 
+// Compute ranges defined in the PHIs in this block.
+
+void
+path_range_query::compute_ranges_in_phis (basic_block bb)
+{
+  int_range_max r;
+  gphi_iterator iter;
+
+  // PHIs must be resolved simultaneously on entry to the block
+  // because any dependencies must be satisfied with values on entry.
+  // Thus, we calculate all PHIs first, and then update the cache at
+  // the end.
+
+  m_tmp_phi_cache.clear ();
+  for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter))
+{
+  gphi *phi = iter.phi ();
+  tree name = gimple_phi_result (phi);
+
+  if (import_p (name) && range_defined_in_block (r, name, bb))
+	m_tmp_phi_cache.set_global_range (name, r);
+}
+  for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter))
+{
+  gphi *phi = iter.phi ();
+  tree name = gimple_phi_result (phi);
+
+  if (m_tmp_phi_cache.get_global_range (r, name))
+	set_cache (r, name);
+}
+}
+
 // Compute ranges defined in the current block, or exported to the
 // next block.
 
@@ -369,15 +401,7 @@ 

[committed] libstdc++: Implement std::spanstream for C++23

2021-11-13 Thread Jonathan Wakely via Gcc-patches
The tests are just the two small examples from the proposal, so more
tests are definitely needed. They can wait for stage 3 though. Tested
powerpc64le-linux, pushed to trunk.


This implements the <spanstream> header, as proposed for C++23 by P0448R4.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add spanstream header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Add spanstream header.
* include/std/version (__cpp_lib_spanstream): Define.
* include/std/spanstream: New file.
* testsuite/27_io/spanstream/1.cc: New test.
* testsuite/27_io/spanstream/version.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 libstdc++-v3/include/precompiled/stdc++.h |   6 +-
 libstdc++-v3/include/std/spanstream   | 446 ++
 libstdc++-v3/include/std/version  |   3 +
 libstdc++-v3/testsuite/27_io/spanstream/1.cc  |  53 +++
 .../testsuite/27_io/spanstream/version.cc |  10 +
 7 files changed, 519 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/include/std/spanstream
 create mode 100644 libstdc++-v3/testsuite/27_io/spanstream/1.cc
 create mode 100644 libstdc++-v3/testsuite/27_io/spanstream/version.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 0e43f147591..25a8d9c8a41 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -76,6 +76,7 @@ std_headers = \
${std_srcdir}/shared_mutex \
${std_srcdir}/source_location \
${std_srcdir}/span \
+   ${std_srcdir}/spanstream \
${std_srcdir}/sstream \
${std_srcdir}/syncstream \
${std_srcdir}/stack \
diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
b/libstdc++-v3/include/precompiled/stdc++.h
index d2601d7859d..e1c10e612e8 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -133,7 +133,7 @@
 #include 
 #endif
 
-#if __cplusplus > 201703L
+#if __cplusplus >= 202002L
 #include 
 #include 
 #include 
@@ -151,3 +151,7 @@
 #include 
 #include 
 #endif
+
+#if __cplusplus > 202002L
+#include <spanstream>
+#endif
diff --git a/libstdc++-v3/include/std/spanstream 
b/libstdc++-v3/include/std/spanstream
new file mode 100644
index 000..240866ff26f
--- /dev/null
+++ b/libstdc++-v3/include/std/spanstream
@@ -0,0 +1,446 @@
+// Streams based on std::span -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file spanstream
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_SPANSTREAM
+#define _GLIBCXX_SPANSTREAM 1
+
+#pragma GCC system_header
+
+#if __cplusplus > 202002L
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#if __cpp_lib_span
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+#define __cpp_lib_spanstream 202106L
+
+template>
+  class basic_spanbuf
+  : public basic_streambuf<_CharT, _Traits>
+  {
+using __streambuf_type = basic_streambuf<_CharT, _Traits>;
+
+  public:
+using char_type   = _CharT;
+using int_type= typename _Traits::int_type;
+using pos_type= typename _Traits::pos_type;
+using off_type= typename _Traits::off_type;
+using traits_type = _Traits;
+
+// [spanbuf.ctor], constructors
+basic_spanbuf() : basic_spanbuf(ios_base::in | ios_base::out)
+{ }
+
+explicit
+basic_spanbuf(ios_base::openmode __which)
+: __streambuf_type(), _M_mode(__which)
+{ }
+
+explicit
+basic_spanbuf(std::span<_CharT> __s,
+ ios_base::openmode __which = ios_base::in | ios_base::out)
+: __streambuf_type(), _M_mode(__which)
+{ span(__s); }
+
+basic_spanbuf(const basic_spanbuf&) = delete;
+
+/// Move constructor. In this implementation `rhs` is left unchanged.
+basic_spanbuf(basic_spanbuf&& __rhs)
+: __streambuf_type(__rhs), _M_mode(__rhs._M_mode)
+{ span(__rhs._M_buf); }
+
+  

Enable ipa-sra for functions with fnspec attribute

2021-11-13 Thread Jan Hubicka via Gcc-patches
Hi,
this patch enables some ipa-sra on Fortran by allowing signature changes on
functions with the "fn spec" attribute when ipa-modref is enabled.  This is
possible since ipa-modref knows how to preserve the things we trace in fnspec,
and the fnspecs generated by the Fortran frontend are quite simple and can now
be analysed automatically.  To be sure, I will also add code that merges
fnspec to parameters.

This unfortunately hits a bug in ipa-param-manipulation when we remove a
parameter that specifies the size of a variable-length parameter.  For this
reason I added a hack that prevents signature changes on such functions, and I
will handle it incrementally.

I tried creating a C testcase, but it is blocked by another problem: we punt
ipa-sra on the access attribute.  This is an optimization regression we ought
to fix, so I filed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223.

As a followup I will add code classifying the type attributes (we have just a
few) and get stats on the access attribute.

Martin, can you please check that the code detecting signature changes is
correct and can't be done more easily?

Bootstrapped/regtested x86_64-linux, committed.
Honza

gcc/ChangeLog:

* ipa-fnsummary.c (compute_fn_summary): Do not give up on signature
changes on "fn spec" attribute; give up on variably modified types.
* ipa-param-manipulation.c: Include attribs.h.
(build_adjusted_function_type): New parameter ARG_MODIFIED; if it is
true remove "fn spec" attribute.
(ipa_param_adjustments::build_new_function_type): Update.
(ipa_param_body_adjustments::modify_formal_parameters): Update.
* ipa-sra.c: Include attribs.h.
(ipa_sra_preliminary_function_checks): Do not check for TYPE_ATTRIBUTES.

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 2cfa9a6d0e9..94a80d3ec90 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3135,10 +3135,38 @@ compute_fn_summary (struct cgraph_node *node, bool 
early)
else
 info->inlinable = tree_inlinable_function_p (node->decl);
 
-   /* Type attributes can use parameter indices to describe them.  */
-   if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl))
-  /* Likewise for #pragma omp declare simd functions or functions
- with simd attribute.  */
+   bool no_signature = false;
+   /* Type attributes can use parameter indices to describe them.
+ Special case fn spec since we can safely preserve them in
+ modref summaries.  */
+   for (tree list = TYPE_ATTRIBUTES (TREE_TYPE (node->decl));
+   list && !no_signature; list = TREE_CHAIN (list))
+if (!flag_ipa_modref
+|| !is_attribute_p ("fn spec", get_attribute_name (list)))
+  {
+if (dump_file)
+   {
+ fprintf (dump_file, "No signature change:"
+  " function type has unhandled attribute %s.\n",
+  IDENTIFIER_POINTER (get_attribute_name (list)));
+   }
+no_signature = true;
+  }
+   for (tree parm = DECL_ARGUMENTS (node->decl);
+   parm && !no_signature; parm = DECL_CHAIN (parm))
+if (variably_modified_type_p (TREE_TYPE (parm), node->decl))
+  {
+if (dump_file)
+   {
+ fprintf (dump_file, "No signature change:"
+  " has parameter with variably modified type.\n");
+   }
+no_signature = true;
+  }
+
+   /* Likewise for #pragma omp declare simd functions or functions
+ with simd attribute.  */
+   if (no_signature
   || lookup_attribute ("omp declare simd",
DECL_ATTRIBUTES (node->decl)))
 node->can_change_signature = false;
diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index ae3149718ca..20f41dd5363 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "symtab-clones.h"
 #include "tree-phinodes.h"
 #include "cfgexpand.h"
+#include "attribs.h"
 
 
 /* Actual prefixes of different newly synthetized parameters.  Keep in sync
@@ -281,11 +282,13 @@ fill_vector_of_new_param_types (vec *new_types, 
vec *otypes,
 /* Build and return a function type just like ORIG_TYPE but with parameter
types given in NEW_PARAM_TYPES - which can be NULL if, but only if,
ORIG_TYPE itself has NULL TREE_ARG_TYPEs.  If METHOD2FUNC is true, also make
-   it a FUNCTION_TYPE instead of FUNCTION_TYPE.  */
+   it a FUNCTION_TYPE instead of FUNCTION_TYPE.
+   If ARG_MODIFIED is true drop attributes that are no longer up to date.  */
 
 static tree
 build_adjusted_function_type (tree orig_type, vec *new_param_types,
- bool method2func, bool skip_return)
+ bool method2func, bool skip_return,
+ bool 

[COMMITTED] path solver: Merge path_range_query constructors.

2021-11-13 Thread Aldy Hernandez via Gcc-patches
There's no need for two constructors, when we can do it all with one
that defaults to the common behavior:

path_range_query (bool resolve = true, gimple_ranger *ranger = NULL);

Tested on x86-64 Linux.
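
For readers who want the ownership pattern in isolation, here is a minimal sketch in C (the patch itself is C++ and uses a default argument instead). All names below are hypothetical stand-ins, not GCC code: a single initializer takes an optional pointer, and a flag playing the role of m_alloced_ranger records whether the object allocated its own helper and must free it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gimple_ranger and path_range_query.  */
struct ranger { int id; };

struct query
{
  struct ranger *ranger;
  bool resolve;
  bool owns_ranger;   /* Plays the role of m_alloced_ranger.  */
};

/* Single initializer: a NULL RANGER means "allocate a private one".
   Returns false if that allocation fails.  */
bool
query_init (struct query *q, bool resolve, struct ranger *ranger)
{
  q->resolve = resolve;
  q->owns_ranger = (ranger == NULL);
  if (q->owns_ranger)
    {
      q->ranger = calloc (1, sizeof *q->ranger);
      if (q->ranger == NULL)
        return false;
    }
  else
    q->ranger = ranger;
  return true;
}

/* Release only what query_init allocated; a borrowed ranger is left
   untouched.  */
void
query_fini (struct query *q)
{
  if (q->owns_ranger)
    free (q->ranger);
  q->ranger = NULL;
}
```

Deriving the ownership flag from the argument, as the merged constructor does with !ranger, keeps the two code paths from drifting apart.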

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::path_range_query): Merge
ctors.
(path_range_query::import_p): Move from header file.
(path_range_query::~path_range_query): Adjust for combined ctors.
* gimple-range-path.h: Merge ctors.
(path_range_query::import_p): Move to .cc file.
---
 gcc/gimple-range-path.cc | 31 +--
 gcc/gimple-range-path.h  | 17 -
 2 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 71b290434cb..32b2cb57597 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -36,33 +36,36 @@ along with GCC; see the file COPYING3.  If not see
 // Internal construct to help facilitate debugging of solver.
 #define DEBUG_SOLVER (dump_file && (param_threader_debug == 
THREADER_DEBUG_ALL))
 
-path_range_query::path_range_query (gimple_ranger *ranger, bool resolve)
+path_range_query::path_range_query (bool resolve, gimple_ranger *ranger)
   : m_cache (new ssa_global_cache),
 m_has_cache_entry (BITMAP_ALLOC (NULL)),
-m_ranger (ranger),
 m_resolve (resolve),
-m_alloced_ranger (false)
+m_alloced_ranger (!ranger)
 {
-  m_oracle = new path_oracle (ranger->oracle ());
-}
+  if (m_alloced_ranger)
+m_ranger = new gimple_ranger;
+  else
+m_ranger = ranger;
 
-path_range_query::path_range_query (bool resolve)
-  : m_cache (new ssa_global_cache),
-m_has_cache_entry (BITMAP_ALLOC (NULL)),
-m_ranger (new gimple_ranger),
-m_resolve (resolve),
-m_alloced_ranger (true)
-{
   m_oracle = new path_oracle (m_ranger->oracle ());
 }
 
 path_range_query::~path_range_query ()
 {
-  BITMAP_FREE (m_has_cache_entry);
-  delete m_cache;
   delete m_oracle;
   if (m_alloced_ranger)
 delete m_ranger;
+  BITMAP_FREE (m_has_cache_entry);
+  delete m_cache;
+}
+
+// Return TRUE if NAME is in the import bitmap.
+
+bool
+path_range_query::import_p (tree name)
+{
+  return (TREE_CODE (name) == SSA_NAME
+ && bitmap_bit_p (m_imports, SSA_NAME_VERSION (name)));
 }
 
 // Mark cache entry for NAME as unused.
diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
index ea4864d35ef..f8b2b04e57c 100644
--- a/gcc/gimple-range-path.h
+++ b/gcc/gimple-range-path.h
@@ -32,10 +32,10 @@ along with GCC; see the file COPYING3.  If not see
 class path_range_query : public range_query
 {
 public:
-  path_range_query (class gimple_ranger *ranger, bool resolve = true);
-  path_range_query (bool resolve = true);
+  path_range_query (bool resolve = true, class gimple_ranger *ranger = NULL);
   virtual ~path_range_query ();
-  void compute_ranges (const vec<basic_block> &, const bitmap_head *imports = 
NULL);
+  void compute_ranges (const vec<basic_block> &,
+  const bitmap_head *imports = NULL);
   void compute_ranges (edge e);
   void compute_imports (bitmap imports, basic_block exit);
   bool range_of_expr (irange &r, tree name, gimple * = NULL) override;
@@ -64,7 +64,7 @@ private:
   void compute_phi_relations (basic_block bb, basic_block prev);
   void maybe_register_phi_relation (gphi *, tree arg);
   bool add_to_imports (tree name, bitmap imports);
-  inline bool import_p (tree name);
+  bool import_p (tree name);
 
   // Path navigation.
   void set_path (const vec<basic_block> &);
@@ -104,13 +104,4 @@ private:
   bool m_alloced_ranger;
 };
 
-// Return TRUE if NAME is in the import bitmap.
-
-bool
-path_range_query::import_p (tree name)
-{
-  return (TREE_CODE (name) == SSA_NAME
- && bitmap_bit_p (m_imports, SSA_NAME_VERSION (name)));
-}
-
 #endif // GCC_TREE_SSA_THREADSOLVER_H
-- 
2.31.1



Re: [PATCH] Combine malloc + memset to calloc

2021-11-13 Thread Prathamesh Kulkarni via Gcc-patches
On Sat, 13 Nov 2021 at 02:00, Seija K. via Gcc-patches
 wrote:
>
> diff --git a/gcc/ada/terminals.c b/gcc/ada/terminals.c
> index a2dd4895d48..25d9acda752 100644
> --- a/gcc/ada/terminals.c
> +++ b/gcc/ada/terminals.c
> @@ -609,8 +609,7 @@ __gnat_setup_communication (struct TTY_Process**
> process_out) /* output param */
>  {
>struct TTY_Process* process;
>
> -  process = (struct TTY_Process*)malloc (sizeof (struct TTY_Process));
> -  ZeroMemory (process, sizeof (struct TTY_Process));
> +  process = (struct TTY_Process*)calloc (1, sizeof (struct TTY_Process));
>*process_out = process;
>
>return 0;
> diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c
> b/gcc/config/rs6000/rs6000-gen-builtins.c
> index 1655a2fd765..2c895a2d9a9 100644
> --- a/gcc/config/rs6000/rs6000-gen-builtins.c
> +++ b/gcc/config/rs6000/rs6000-gen-builtins.c
> @@ -1307,8 +1307,7 @@ parse_args (prototype *protoptr)
>do {
>  consume_whitespace ();
>  int oldpos = pos;
> -typelist *argentry = (typelist *) malloc (sizeof (typelist));
> -memset (argentry, 0, sizeof *argentry);
> +typelist *argentry = (typelist *) calloc (1, sizeof (typelist));
Just wondering -- shouldn't this be xcalloc instead (and similarly in
other places) ?
>  typeinfo *argtype = &argentry->info;
>  success = match_type (argtype, VOID_NOTOK);
>  if (success)
> diff --git a/gcc/d/dmd/ctfeexpr.c b/gcc/d/dmd/ctfeexpr.c
> index a8e97833ad0..401ed748f43 100644
> --- a/gcc/d/dmd/ctfeexpr.c
> +++ b/gcc/d/dmd/ctfeexpr.c
> @@ -1350,8 +1350,7 @@ int ctfeRawCmp(Loc loc, Expression *e1, Expression
> *e2)
>  if (es2->keys->length != dim)
>  return 1;
>
> -bool *used = (bool *)mem.xmalloc(sizeof(bool) * dim);
> -memset(used, 0, sizeof(bool) * dim);
> +bool *used = (bool *)mem.xcalloc(dim, sizeof(bool));
>
>  for (size_t i = 0; i < dim; ++i)
>  {
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 0cba95411a6..f5bff8b9441 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -3081,9 +3081,16 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>   0).exists ())
>   {
>unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size);
> -  unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
> -  memset (buf, (init_type == AUTO_INIT_PATTERN
> - ? INIT_PATTERN_VALUE : 0), total_bytes);
> +  unsigned char *buf;
> +if (init_type == AUTO_INIT_PATTERN)
> +  {
> +buf = (unsigned char *) xmalloc (total_bytes);
> +memset (buf, INIT_PATTERN_VALUE, total_bytes);
> +  }
> +else
> +  {
> +buf = (unsigned char *) xcalloc (1, total_bytes);
> +  }
Formatting nit for else -- no need for braces for single stmt.
In general, please run the patch thru contrib/check_GNU_style.py.
Leaving the actual review to maintainers.

Thanks,
Prathamesh
>tree itype = build_nonstandard_integer_type
>   (total_bytes * BITS_PER_UNIT, 1);
>wide_int w = wi::from_buffer (buf, total_bytes);
> diff --git a/libiberty/calloc.c b/libiberty/calloc.c
> index f4bd27b1cd2..1ef4156d28a 100644
> --- a/libiberty/calloc.c
> +++ b/libiberty/calloc.c
> @@ -17,7 +17,7 @@ Uses @code{malloc} to allocate storage for @var{nelem}
> objects of
>
>  /* For systems with larger pointers than ints, this must be declared.  */
>  PTR malloc (size_t);
> -void bzero (PTR, size_t);
> +void memset (PTR, int, size_t);
>
>  PTR
>  calloc (size_t nelem, size_t elsize)
> @@ -28,7 +28,7 @@ calloc (size_t nelem, size_t elsize)
>  nelem = elsize = 1;
>
>ptr = malloc (nelem * elsize);
> -  if (ptr) bzero (ptr, nelem * elsize);
> +  if (ptr) memset (ptr, 0, nelem * elsize);
>
>return ptr;
>  }
> diff --git a/libiberty/partition.c b/libiberty/partition.c
> index 81e5fc0f79a..75512d67258 100644
> --- a/libiberty/partition.c
> +++ b/libiberty/partition.c
> @@ -146,8 +146,7 @@ partition_print (partition part, FILE *fp)
>int e;
>
>/* Flag the elements we've already printed.  */
> -  done = (char *) xmalloc (num_elements);
> -  memset (done, 0, num_elements);
> +  done = (char *) xcalloc (num_elements, 1);
>
>/* A buffer used to sort elements in a class.  */
>class_elements = (int *) xmalloc (num_elements * sizeof (int));
> diff --git a/libobjc/gc.c b/libobjc/gc.c
> index 57895e61930..95a75f5cb2e 100644
> --- a/libobjc/gc.c
> +++ b/libobjc/gc.c
> @@ -307,10 +307,9 @@ __objc_generate_gc_type_description (Class class)
>   / sizeof (void *));
>size = ROUND (bits_no, BITS_PER_WORD) / BITS_PER_WORD;
>mask = objc_atomic_malloc (size * sizeof (int));
> -  memset (mask, 0, size * sizeof (int));
>
>class_structure_type = objc_atomic_malloc (type_size);
> -  *class_structure_type = current = 0;
> +  current = 0;
>__objc_class_structure_encoding (class, &class_structure_type,
> &type_size, &current);
>if (current + 1 == type_size)
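
To make the review point above concrete, here is a small sketch of the two forms under discussion. The checked_calloc helper is a hypothetical stand-in written for this note; it is not libiberty's xcalloc, it only illustrates the "abort instead of returning NULL" behaviour that the x* allocators are used for in GCC.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Two-step form that the patch removes: allocate, then clear.  Note
   that N * SIZE is computed without any overflow check here.  */
void *
zeroed_two_step (size_t n, size_t size)
{
  void *p = malloc (n * size);
  if (p)
    memset (p, 0, n * size);
  return p;
}

/* Hypothetical checked wrapper (NOT libiberty's xcalloc): returns
   zeroed storage or aborts, so callers need no NULL check.  A NULL
   result for a zero-sized request is passed through, since calloc is
   allowed to return NULL in that case.  */
void *
checked_calloc (size_t n, size_t size)
{
  void *p = calloc (n, size);
  if (p == NULL && n != 0 && size != 0)
    {
      fputs ("out of memory\n", stderr);
      abort ();
    }
  return p;
}
```

Both functions hand back the same zero-filled bytes on success; the difference is what happens on failure and who has to check for it.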


[PATCH 4/4] Add aarch64-darwin support for off-stack trampolines

2021-11-13 Thread Maxim Blinov
Note: This patch is not yet ready for trunk as it is dependent on some
patches that are not yet upstream; however, it serves as motivation for
the previous patch(es), which are independent.



Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the aarch64-darwin platform. For this platform
--enable-off-stack-trampolines is enabled by default, and
-foff-stack-trampolines is enabled by default if the host MacOS
operating system version is 11.x or greater.

Co-authored-by: Andrew Burgess 

libgcc/ChangeLog:

* config/aarch64/heap-trampoline.c (allocate_trampoline_page):
Request for MAP_JIT in the case of __APPLE__.
Provide __APPLE__ variant of aarch64_trampoline_insns that uses
x16 as the chain pointer.
(__builtin_nested_func_ptr_created): Call
pthread_jit_write_protect_np() to toggle read/write permission on
page.
* config.host (aarch64*-*darwin* | arm64*-*darwin*): Handle
--enable-off-stack-trampolines.
* configure.ac (--enable-off-stack-trampolines): Permit setting
for target aarch64*-*darwin* | arm64*-*darwin*, and set default to
enabled.
* configure: Regenerate.
---
 gcc/config.gcc  |  7 +
 libgcc/config.host  |  4 +++
 libgcc/config/aarch64/heap-trampoline.c | 36 +
 libgcc/configure|  6 +
 libgcc/configure.ac |  6 +
 5 files changed, 59 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 031be563c5d..c13f7629d44 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1072,6 +1072,13 @@ esac
 
 # Figure out if we need to enable -foff-stack-trampolines by default.
 case ${target} in
+aarch64*-*darwin* | arm64*-*darwin*)
+  if test ${macos_maj} = 11 || test ${macos_maj} = 12; then
+tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=1"
+  else
+tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
+  fi
+  ;;
 *)
   tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
   ;;
diff --git a/libgcc/config.host b/libgcc/config.host
index d1a491d27e7..3c536b0928a 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -414,6 +414,10 @@ aarch64*-*darwin* | arm64*-*darwin* )
tmake_file="${tmake_file} t-crtfm"
# No soft float for now because our long double is DF not TF.
md_unwind_header=aarch64/aarch64-unwind.h
+   if test x$off_stack_trampolines = xyes; then
+   extra_parts="$extra_parts heap-trampoline.o"
+   tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
+   fi
;;
 aarch64*-*-freebsd*)
extra_parts="$extra_parts crtfastmath.o"
diff --git a/libgcc/config/aarch64/heap-trampoline.c 
b/libgcc/config/aarch64/heap-trampoline.c
index 721a2bed400..6994602beaf 100644
--- a/libgcc/config/aarch64/heap-trampoline.c
+++ b/libgcc/config/aarch64/heap-trampoline.c
@@ -5,6 +5,9 @@
 #include 
 #include 
 
+/* For pthread_jit_write_protect_np */
+#include 
+
 void *allocate_trampoline_page (void);
 int get_trampolines_per_page (void);
 struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
@@ -43,8 +46,15 @@ allocate_trampoline_page (void)
 {
   void *page;
 
+#if defined(__gnu_linux__)
   page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
   MAP_ANON | MAP_PRIVATE, 0, 0);
+#elif defined(__APPLE__)
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+  MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
+#else
+  page = MAP_FAILED;
+#endif
 
   return page;
 }
@@ -67,6 +77,7 @@ allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
   return p;
 }
 
+#if defined(__gnu_linux__)
 static const uint32_t aarch64_trampoline_insns[] = {
   0xd503245f, /* hint34 */
   0x58b1, /* ldr x17, .+20 */
@@ -76,6 +87,20 @@ static const uint32_t aarch64_trampoline_insns[] = {
   0xd5033fdf /* isb */
 };
 
+#elif defined(__APPLE__)
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint34 */
+  0x58b1, /* ldr x17, .+20 */
+  0x58d0, /* ldr x16, .+24 */
+  0xd61f0220, /* br  x17 */
+  0xd5033f9f, /* dsb sy */
+  0xd5033fdf /* isb */
+};
+
+#else
+#error "Unsupported AArch64 platform for heap trampolines"
+#endif
+
 void
 __builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
 {
@@ -99,11 +124,22 @@ __builtin_nested_func_ptr_created (void *chain, void 
*func, void **dst)
 = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
- tramp_ctrl_curr->free_trampolines];
 
+#if defined(__APPLE__)
+  /* Disable write protection for the MAP_JIT regions in this thread (see
+ 
https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon)
 */
+  pthread_jit_write_protect_np (0);
+#endif
+
   memcpy (trampoline->insns, aarch64_trampoline_insns,
  sizeof(aarch64_trampoline_insns));
   trampoline->func_ptr = func;

[PATCH 1/4] Generate off-stack nested function trampolines

2021-11-13 Thread Maxim Blinov
Add support for allocating nested function trampolines on an
executable heap rather than on the stack. This is motivated by targets
such as AArch64 Darwin, which globally prohibit executing code on the
stack.

The target-specific routines for allocating and writing trampolines are
to be provided in libgcc, and are by default _not_ compiled in unless
the target specifically requires it, or you manually provide
--enable-off-stack-trampolines when configuring gcc/libgcc.

The gcc flag -foff-stack-trampolines controls whether to generate code
that instantiates trampolines on the stack, or to emit calls to
__builtin_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted. Note that this flag is completely
independent of libgcc: If libgcc is for any reason missing those
symbols, you will get a link failure.

This implementation imposes some implicit restrictions as compared to
stack trampolines. longjmp'ing back to a state before a trampoline was
created will cause us to skip over the corresponding
__builtin_nested_func_ptr_deleted, which will leak trampolines
starting from the beginning of the linked list of allocated
trampolines. There may be scope for instrumenting longjmp/setjmp to
trigger cleanups of trampolines.
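
The longjmp restriction can be illustrated without any real trampolines. The sketch below is only a simulation: fake_created and fake_deleted are hypothetical stand-ins for the two libgcc entry points and do nothing but count, and frame_with_trampoline models a function whose nested function has its address taken.

```c
#include <setjmp.h>

/* Hypothetical stand-ins for __builtin_nested_func_ptr_created and
   __builtin_nested_func_ptr_deleted; they only keep a count.  */
static int live_trampolines;
static jmp_buf escape;

static void fake_created (void) { live_trampolines++; }
static void fake_deleted (void) { live_trampolines--; }

/* Models a frame that creates a trampoline on entry and deletes it on
   normal exit.  */
static void
frame_with_trampoline (int bail_out)
{
  fake_created ();
  if (bail_out)
    longjmp (escape, 1);   /* Skips the fake_deleted call below.  */
  fake_deleted ();
}

/* Run one frame and return how many trampolines are still live.  */
int
run_frame (int bail_out)
{
  live_trampolines = 0;
  if (setjmp (escape) == 0)
    frame_with_trampoline (bail_out);
  return live_trampolines;
}
```

With a normal return the count goes back to zero; when the frame is left through longjmp the count stays at one, which is the leak described above.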

Co-authored-by: Andrew Burgess 

gcc/ChangeLog:

* builtins.def (BUILT_IN_NESTED_PTR_CREATED): Define.
(BUILT_IN_NESTED_PTR_DELETED): Ditto.
* common.opt (foff-stack-trampolines): Add flag to control
generation of heap-based trampoline instantiation.
* tree-nested.c (convert_tramp_reference_op): Don't bother calling
__builtin_adjust_trampoline for the off-stack case.
(finalize_nesting_tree_1): Emit calls to
__builtin_nested_...{created,deleted} if we're generating with
-foff-stack-trampolines.
* tree.c (build_common_builtin_nodes): Build
__builtin_nested_...{created,deleted}.
* doc/invoke.texi (-foff-stack-trampolines): Document.

libgcc/ChangeLog:

* configure.ac: Add configure parameter
--enable-off-stack-trampolines, and do error checking if we're
trying to enable off-stack trampolines for a platform that doesn't
provide any such implementation.
* configure: Regenerate.
* libgcc-std.ver.in: Ditto.
* libgcc2.h (__builtin_nested_func_ptr_created): Declare.
(__builtin_nested_func_ptr_deleted): Ditto.
---
 gcc/builtins.def |   2 +
 gcc/common.opt   |   4 ++
 gcc/config.gcc   |   7 +++
 gcc/doc/invoke.texi  |  14 +
 gcc/tree-nested.c| 121 +--
 gcc/tree.c   |  17 ++
 libgcc/configure |  26 +
 libgcc/configure.ac  |  17 ++
 libgcc/libgcc-std.ver.in |   3 +
 libgcc/libgcc2.h |   3 +
 10 files changed, 197 insertions(+), 17 deletions(-)

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 45a09b4d42d..90a94a6dd0f 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -950,6 +950,8 @@ DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, 
"__builtin_adjust_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_CREATED, 
"__builtin_nested_func_ptr_created")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_DELETED, 
"__builtin_nested_func_ptr_deleted")
 
 /* Implementing __builtin_setjmp.  */
 DEF_BUILTIN_STUB (BUILT_IN_SETJMP_SETUP, "__builtin_setjmp_setup")
diff --git a/gcc/common.opt b/gcc/common.opt
index de9b848eda5..a97aeeb2165 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2149,6 +2149,10 @@ foffload-abi=
 Common Joined RejectNegative Enum(offload_abi)
 -foffload-abi=[lp64|ilp32] Set the ABI to use in an offload compiler.
 
+foff-stack-trampolines
+Common RejectNegative Var(flag_off_stack_trampolines) 
Init(OFF_STACK_TRAMPOLINES_INIT)
+Generate trampolines in executable memory rather than executable stack.
+
 Enum
 Name(offload_abi) Type(enum offload_abi) UnknownError(unknown offload ABI %qs)
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index edd12655c4a..c479aa4cc44 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1070,6 +1070,13 @@ case ${target} in
   ;;
 esac
 
+# Figure out if we need to enable -foff-stack-trampolines by default.
+case ${target} in
+*)
+  tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
+  ;;
+esac
+
 case ${target} in
 aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2aba4c70b44..a5db65f8721 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -660,6 +660,7 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-fcall-saved-@var{reg}  -fcall-used-@var{reg} @gol
 -ffixed-@var{reg}  -fexceptions @gol
 -fnon-call-exceptions  

[PATCH 2/4] Add x86_64-linux support for off-stack trampolines

2021-11-13 Thread Maxim Blinov
Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the x86_64-linux platform. This serves to exercise the
infrastructure added in libgcc (--enable-off-stack-trampolines) and
gcc (-foff-stack-trampolines) in supporting off-stack trampoline
generation, and is intended primarily for demonstration and debugging
purposes.
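
As a rough illustration of what "writing a trampoline" means on this target, the sketch below fills a plain byte buffer from the same 23-byte template used in the patch and stores the two 64-bit immediates at their offsets. It is an assumption-laden simplification: it uses memcpy at fixed offsets rather than the packed union from the patch, takes the addresses as uint64_t, and never executes the bytes. The function and constant names are made up for this note.

```c
#include <stdint.h>
#include <string.h>

/* Offsets of the two 64-bit immediates inside the template.  */
enum { FUNC_IMM_OFFSET = 2, CHAIN_IMM_OFFSET = 12, TRAMPOLINE_SIZE = 23 };

static const uint8_t template_bytes[TRAMPOLINE_SIZE] = {
  0x49, 0xbb, 0, 0, 0, 0, 0, 0, 0, 0,   /* movabs $func, %r11 */
  0x49, 0xba, 0, 0, 0, 0, 0, 0, 0, 0,   /* movabs $chain, %r10 */
  0x41, 0xff, 0xe3                      /* jmp *%r11 */
};

/* Copy the template into OUT and patch in FUNC and CHAIN.  The values
   are stored in host byte order, which on x86_64 is the little-endian
   order the instruction encoding expects.  */
void
fill_trampoline (uint8_t out[TRAMPOLINE_SIZE], uint64_t func, uint64_t chain)
{
  memcpy (out, template_bytes, TRAMPOLINE_SIZE);
  memcpy (out + FUNC_IMM_OFFSET, &func, sizeof func);
  memcpy (out + CHAIN_IMM_OFFSET, &chain, sizeof chain);
}
```

In the real implementation the destination is a slot in an executable mmap'ed page, and the instruction cache is flushed afterwards.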

Co-authored-by: Andrew Burgess 

libgcc/ChangeLog:

* config/i386/heap-trampoline.c: New file: Implement off-stack
trampolines for x86_64.
* config/i386/t-heap-trampoline: Add rule to build
config/i386/heap-trampoline.c
* config.host (x86_64-*-linux*): Handle
--enable-off-stack-trampolines.
* configure.ac (--enable-off-stack-trampolines): Permit setting
for target x86_64-*-linux*.
* configure: Regenerate.
---
 libgcc/config.host   |   4 +
 libgcc/config/i386/heap-trampoline.c | 143 +++
 libgcc/config/i386/t-heap-trampoline |  21 
 libgcc/configure |   3 +
 libgcc/configure.ac  |   3 +
 5 files changed, 174 insertions(+)
 create mode 100644 libgcc/config/i386/heap-trampoline.c
 create mode 100644 libgcc/config/i386/t-heap-trampoline

diff --git a/libgcc/config.host b/libgcc/config.host
index 168535b1780..163cd4c4161 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -753,6 +753,10 @@ x86_64-*-linux*)
tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff 
t-dfprules"
tm_file="${tm_file} i386/elf-lib.h"
md_unwind_header=i386/linux-unwind.h
+   if test x$off_stack_trampolines = xyes; then
+   extra_parts="${extra_parts} heap-trampoline.o"
+   tmake_file="${tmake_file} i386/t-heap-trampoline"
+   fi
;;
 x86_64-*-kfreebsd*-gnu)
extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o 
crtfastmath.o"
diff --git a/libgcc/config/i386/heap-trampoline.c 
b/libgcc/config/i386/heap-trampoline.c
new file mode 100644
index 000..6c202660c35
--- /dev/null
+++ b/libgcc/config/i386/heap-trampoline.c
@@ -0,0 +1,143 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+struct tramp_ctrl_data;
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  union ix86_trampoline *trampolines;
+};
+
+static const uint8_t trampoline_insns[] = {
+  /* movabs $,%r11  */
+  0x49, 0xbb,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* movabs $,%r10  */
+  0x49, 0xba,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* rex.WB jmpq *%r11  */
+  0x41, 0xff, 0xe3
+};
+
+union ix86_trampoline {
+  uint8_t insns[sizeof(trampoline_insns)];
+
+  struct __attribute__((packed)) fields {
+uint8_t insn_0[2];
+void *func_ptr;
+uint8_t insn_1[2];
+void *chain_ptr;
+uint8_t insn_2[3];
+  } fields;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(union ix86_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+  MAP_ANON | MAP_PRIVATE, 0, 0);
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+{
+  tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+  if (tramp_ctrl_curr == NULL)
+   abort ();
+}
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+{
+  void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+  if (!tramp_ctrl)
+   abort ();
+
+  tramp_ctrl_curr = tramp_ctrl;
+}
+
+  union ix86_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+   - tramp_ctrl_curr->free_trampolines];
+
+  memcpy (trampoline->insns, trampoline_insns,
+ sizeof(trampoline_insns));
+  trampoline->fields.func_ptr = func;
+  trampoline->fields.chain_ptr = chain;
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+  ((void *)trampoline->insns + 
sizeof(trampoline->insns)));
+
+  

[PATCH 3/4] Add aarch64-linux support for off-stack trampolines

2021-11-13 Thread Maxim Blinov
Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the aarch64-linux platform. This serves to exercise the
infrastructure added in libgcc (--enable-off-stack-trampolines) and
gcc (-foff-stack-trampolines) in supporting off-stack trampoline
generation, and is intended primarily for demonstration and debugging
purposes.
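
The per-page bookkeeping used by this implementation can be sketched on its own. The struct below copies the field layout of the aarch64 trampoline from the patch, but the struct and function names are hypothetical and the page size is passed in as a parameter rather than queried with getpagesize, so that the arithmetic is easy to check.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the trampoline struct in the patch, duplicated
   here under a hypothetical name so the sketch is self-contained.  */
struct sketch_trampoline
{
  uint32_t insns[6];
  void *func_ptr;
  void *chain_ptr;
};

/* How many trampolines fit in a page of PAGE_SIZE bytes.  */
size_t
slots_per_page (size_t page_size)
{
  return page_size / sizeof (struct sketch_trampoline);
}

/* Index of the next slot to hand out, given how many slots are still
   free.  Slots are used from index 0 upwards, so a completely free
   page yields index 0.  */
size_t
next_slot_index (size_t page_size, size_t free_slots)
{
  return slots_per_page (page_size) - free_slots;
}
```

Once the free count reaches zero, the patch allocates a fresh control block and page and links it to the previous one.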

Co-authored-by: Andrew Burgess 

libgcc/ChangeLog:

* config/aarch64/heap-trampoline.c: New file: Implement off-stack
trampolines for aarch64.
* config/aarch64/t-heap-trampoline: Add rule to build
config/aarch64/heap-trampoline.c
* config.host (aarch64-*-linux*): Handle
--enable-off-stack-trampolines.
* configure.ac (--enable-off-stack-trampolines): Permit setting
for target aarch64-*-linux*.
* configure: Regenerate.
---
 libgcc/config.host  |   4 +
 libgcc/config/aarch64/heap-trampoline.c | 133 
 libgcc/config/aarch64/t-heap-trampoline |  21 
 libgcc/configure|   3 +
 libgcc/configure.ac |   3 +
 5 files changed, 164 insertions(+)
 create mode 100644 libgcc/config/aarch64/heap-trampoline.c
 create mode 100644 libgcc/config/aarch64/t-heap-trampoline

diff --git a/libgcc/config.host b/libgcc/config.host
index 163cd4c4161..912477db7d9 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -388,6 +388,10 @@ aarch64*-*-linux*)
tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
+   if test x$off_stack_trampolines = xyes; then
+   extra_parts="$extra_parts heap-trampoline.o"
+   tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
+   fi
;;
 aarch64*-*-vxworks7*)
extra_parts="$extra_parts crtfastmath.o"
diff --git a/libgcc/config/aarch64/heap-trampoline.c 
b/libgcc/config/aarch64/heap-trampoline.c
new file mode 100644
index 000..721a2bed400
--- /dev/null
+++ b/libgcc/config/aarch64/heap-trampoline.c
@@ -0,0 +1,133 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+struct tramp_ctrl_data;
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  struct aarch64_trampoline *trampolines;
+};
+
+struct aarch64_trampoline {
+  uint32_t insns[6];
+  void *func_ptr;
+  void *chain_ptr;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(struct aarch64_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+  MAP_ANON | MAP_PRIVATE, 0, 0);
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint34 */
+  0x58b1, /* ldr x17, .+20 */
+  0x58d2, /* ldr x18, .+24 */
+  0xd61f0220, /* br  x17 */
+  0xd5033f9f, /* dsb sy */
+  0xd5033fdf /* isb */
+};
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+{
+  tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+  if (tramp_ctrl_curr == NULL)
+   abort ();
+}
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+{
+  void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+  if (!tramp_ctrl)
+   abort ();
+
+  tramp_ctrl_curr = tramp_ctrl;
+}
+
+  struct aarch64_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+   - tramp_ctrl_curr->free_trampolines];
+
+  memcpy (trampoline->insns, aarch64_trampoline_insns,
+ sizeof(aarch64_trampoline_insns));
+  trampoline->func_ptr = func;
+  trampoline->chain_ptr = chain;
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+  ((void *)trampoline->insns + 
sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+abort ();
+
+  

[PATCH 2/2] Implement TARGET_..._CA target hooks for AArch64 Darwin

2021-11-13 Thread Maxim Blinov
Note: This patch is not yet ready for trunk as it is dependent on some
patches that are not yet upstream; however, it serves as motivation for
the previous patch(es), which are independent.



The AArch64 Darwin platform requires that named stack arguments are
passed naturally-aligned, while variadic stack arguments are passed on
word boundaries. Use the TARGET_FUNCTION_ARG_BOUNDARY_CA and
TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA target hooks to let the backend
correctly lay out stack parameters.
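
A simplified model of the rule described above, for illustration only: a named stack argument starts at the next multiple of its natural alignment, while a variadic one starts at the next 8-byte word boundary. The function names are hypothetical, and the sketch deliberately ignores registers, over-aligned types and everything else the real ABI has to handle.

```c
#include <stdbool.h>
#include <stddef.h>

enum { STACK_WORD_BYTES = 8 };

/* Round VALUE up to the next multiple of BOUNDARY (BOUNDARY > 0).  */
static size_t
round_up (size_t value, size_t boundary)
{
  return (value + boundary - 1) / boundary * boundary;
}

/* Offset at which the next stack argument starts, given that the
   previous arguments end at OFFSET.  ALIGN is the argument's natural
   alignment in bytes; NAMED says whether it is a named parameter.  */
size_t
arg_start_offset (size_t offset, size_t align, bool named)
{
  size_t boundary = named ? align : STACK_WORD_BYTES;
  return round_up (offset, boundary);
}
```

For example, a char following another char lands at offset 1 when named but at offset 8 when passed through the variadic part, which is why the backend needs to know how many named parameters the function has.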

gcc/ChangeLog:

* config.gcc: Enable -fstack-use-cumulative-args by default if the
host platform is MacOS 11.x or 12.x and we're on AArch64.

gcc/config/aarch64/ChangeLog:

* aarch64-protos.h (aarch64_init_cumulative_incoming_args):
Declare.
* aarch64.c (aarch64_init_cumulative_args): Initialize
`darwinpcs_n_named` (Total number of named parameters) and
`darwinpcs_n_args_processed` (Total number of parameters we
have processed, including variadic if any.)
(aarch64_init_cumulative_incoming_args): Implement the
INIT_CUMULATIVE_INCOMING_ARGS macro in order to capture
information on the number of named parameters for the current
function.
(aarch64_function_arg_advance): Increment
`darwinpcs_n_args_processed` each time we layout a function
parameter.
(aarch64_function_arg_boundary_ca): Implement
TARGET_FUNCTION_ARG_BOUNDARY_CA and
TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA to layout args based on
whether we're a named parameter or not.
(aarch64_function_arg_round_boundary_ca): Ditto.
(TARGET_FUNCTION_ARG_BOUNDARY_CA): Define.
(TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA): Ditto.
* aarch64.h (CUMULATIVE_ARGS): Add `darwinpcs_n_named` and
`darwinpcs_n_args_processed`.
(INIT_CUMULATIVE_INCOMING_ARGS): Define.
---
 gcc/config.gcc  |  7 
 gcc/config/aarch64/aarch64-protos.h |  1 +
 gcc/config/aarch64/aarch64.c| 56 +
 gcc/config/aarch64/aarch64.h|  5 +++
 4 files changed, 69 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index e12a9f042d0..626ba089c07 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1072,6 +1072,13 @@ esac
 
 # Figure out if we need to enable -foff-stack-trampolines by default.
 case ${target} in
+aarch64*-*darwin* | arm64*-*darwin*)
+  if test ${macos_maj} = 11 || test ${macos_maj} = 12; then
+tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=1"
+  else
+tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
+  fi
+  ;;
 *)
   tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
   ;;
diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index a204647241e..cdc51fce906 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -896,6 +896,7 @@ void aarch64_expand_vector_init (rtx, rtx);
 void aarch64_sve_expand_vector_init (rtx, rtx);
 void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx,
   const_tree, unsigned, bool = false);
+void aarch64_init_cumulative_incoming_args (CUMULATIVE_ARGS *, const_tree, 
rtx);
 void aarch64_init_expanders (void);
 void aarch64_init_simd_builtins (void);
 void aarch64_emit_call_insn (rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 38b3f1eab89..70c2336ab3a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7042,6 +7042,8 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum,
   pcum->darwinpcs_stack_bytes = 0;
   pcum->darwinpcs_sub_word_offset = 0;
   pcum->darwinpcs_sub_word_pos = 0;
+  pcum->darwinpcs_n_named = n_named;
+  pcum->darwinpcs_n_args_processed = 0;
   pcum->silent_p = silent_p;
   pcum->aapcs_vfp_rmode = VOIDmode;
 
@@ -7072,6 +7074,20 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum,
 }
 }
 
+void
+aarch64_init_cumulative_incoming_args (CUMULATIVE_ARGS *pcum,
+  const_tree fntype,
+  rtx libname ATTRIBUTE_UNUSED)
+{
+#if !TARGET_MACHO
+  INIT_CUMULATIVE_ARGS (*pcum, fntype, libname, current_function_decl, -1);
+#else
+  int n_named_args = (list_length (TYPE_ARG_TYPES (fntype)));
+
+  aarch64_init_cumulative_args (pcum, fntype, libname, current_function_decl, 
n_named_args);
+#endif
+}
+
 static void
 aarch64_function_arg_advance (cumulative_args_t pcum_v,
   const function_arg_info &arg)
@@ -7092,6 +7108,7 @@ aarch64_function_arg_advance (cumulative_args_t pcum_v,
   pcum->aapcs_stack_size += pcum->aapcs_stack_words;
   pcum->aapcs_stack_words = 0;
   pcum->aapcs_reg = NULL_RTX;
+  pcum->darwinpcs_n_args_processed++;
 }
 }
 
@@ -7147,6 +7164,19 @@ aarch64_function_arg_boundary (machine_mode mode, 
const_tree type)
 #endif
 }
 
+static unsigned int

[PATCH 1/2] Add cumulative_args_t variants of TARGET_FUNCTION_ROUND_BOUNDARY and friends

2021-11-13 Thread Maxim Blinov
The two target hooks responsible for informing GCC about stack
parameter alignment are `TARGET_FUNCTION_ARG_BOUNDARY` and
`TARGET_FUNCTION_ARG_ROUND_BOUNDARY`, which currently consider only
the tree type and machine_mode of the individual argument.

Create two new target hooks suffixed with '_CA', and pass in a third
`cumulative_args_t` parameter. This enables the backend to make
alignment decisions based on the context of the whole function rather
than individual parameters.

The original machine_mode/tree type macros are not removed - they are
called by the default implementations of `TARGET_...BOUNDARY_CA` and
`TARGET_...ROUND_BOUNDARY_CA`. This is done with the intention of
avoiding large mechanical modifications of nearly every backend in
GCC. There is also a new flag, -fstack-use-cumulative-args, which
gates the new `..._CA` hooks so that they can be bypassed completely.
This feature is intended for debugging GCC itself.
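
To illustrate the delegation described above, here is a minimal
standalone sketch. It is not the code from this patch: every `mock_*`
name, the `MOCK_PARM_BOUNDARY` value, and the context-aware policy are
assumptions made up for illustration, and alignment is reduced to a
plain bit count instead of a tree type and machine_mode.

```c
#include <assert.h>

/* Hypothetical stand-in for the backend's cumulative-args state; the
   real CUMULATIVE_ARGS is target-specific and much larger.  */
typedef struct
{
  int n_named;          /* number of named parameters */
  int n_args_processed; /* arguments laid out so far */
} mock_cumulative_args;

/* Hypothetical shape of a `_CA' hook: alignment in bits plus context.  */
typedef unsigned int (*mock_boundary_ca_hook) (unsigned int type_align,
                                               const mock_cumulative_args *ca);

enum { MOCK_PARM_BOUNDARY = 64 }; /* assumed minimum stack slot alignment */

/* Stand-in for an existing per-argument hook: it sees only the argument.  */
static unsigned int
mock_function_arg_boundary (unsigned int type_align)
{
  return type_align > MOCK_PARM_BOUNDARY ? type_align : MOCK_PARM_BOUNDARY;
}

/* Default `_CA' variant: ignore the context and delegate, so a backend
   that does not override it behaves exactly as before.  */
static unsigned int
mock_default_function_arg_boundary_ca (unsigned int type_align,
                                       const mock_cumulative_args *ca)
{
  (void) ca;
  return mock_function_arg_boundary (type_align);
}

/* Illustrative override (made-up policy): named arguments keep their
   natural alignment, unnamed ones get a full stack slot.  */
static unsigned int
mock_context_aware_boundary_ca (unsigned int type_align,
                                const mock_cumulative_args *ca)
{
  assert (ca != 0);
  if (ca->n_args_processed >= ca->n_named)
    return MOCK_PARM_BOUNDARY;
  return type_align;
}

/* Caller side: the flag decides whether the `_CA' hook is consulted.  */
static unsigned int
mock_locate_boundary (int use_cumulative_args, mock_boundary_ca_hook hook,
                      unsigned int type_align, const mock_cumulative_args *ca)
{
  if (use_cumulative_args)
    return hook (type_align, ca);
  return mock_function_arg_boundary (type_align);
}
```

With the flag off, or with the default hook installed, a 32-bit
argument is placed on a 64-bit boundary as before; only the flag plus
an overriding hook changes the answer.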

gcc/ChangeLog:

* calls.c (initialize_argument_information): Pass `args_so_far`.
* common.opt: New flag `-fstack-use-cumulative-args`.
* config.gcc: No platforms currently use ..._CA-hooks: Set
-fstack-use-cumulative-args to be off by default.
* target.h (cumulative_args_t): Move declaration from here, to...
* cumulative-args.h (cumulative_args_t): ...this new file. This is
to permit backends to include the declaration of cumulative_args_t
without dragging in circular dependencies.
* function.c (assign_parm_find_entry_rtl): Provide
cumulative_args_t to locate_and_pad_parm.
(gimplify_parameters): Ditto.
(locate_and_pad_parm): Conditionally call new hooks if we're
invoked with -fstack-use-cumulative-args.
* function.h: Include cumulative-args.h.
(locate_and_pad_parm): Add cumulative_args_t parameter.
* target.def (function_arg_boundary_ca): Add.
(function_arg_round_boundary_ca): Ditto.
* targhooks.c (default_function_arg_boundary_ca): Implement.
(default_function_arg_round_boundary_ca): Ditto.
* targhooks.h (default_function_arg_boundary_ca): Declare.
(default_function_arg_round_boundary_ca): Ditto.
* doc/invoke.texi (-fstack-use-cumulative-args): Document.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Ditto.
---
 gcc/calls.c   |  3 +++
 gcc/common.opt|  4 
 gcc/config.gcc|  7 +++
 gcc/cumulative-args.h | 20 
 gcc/doc/invoke.texi   | 12 
 gcc/doc/tm.texi   | 20 
 gcc/doc/tm.texi.in|  4 
 gcc/function.c| 25 +
 gcc/function.h|  2 ++
 gcc/target.def| 24 
 gcc/target.h  | 17 +
 gcc/targhooks.c   | 16 
 gcc/targhooks.h   |  6 ++
 13 files changed, 140 insertions(+), 20 deletions(-)
 create mode 100644 gcc/cumulative-args.h

diff --git a/gcc/calls.c b/gcc/calls.c
index 27b59f26ad3..cef612a6ef4 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1527,6 +1527,7 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
 #endif
 reg_parm_stack_space,
 args[i].pass_on_stack ? 0 : args[i].partial,
+args_so_far,
 fndecl, args_size, &args[i].locate);
 #ifdef BLOCK_REG_PADDING
   else
@@ -4205,6 +4206,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
   argvec[count].reg != 0,
 #endif
   reg_parm_stack_space, 0,
+  args_so_far,
   NULL_TREE, &args_size, &argvec[count].locate);
 
   if (argvec[count].reg == 0 || argvec[count].partial != 0
@@ -4296,6 +4298,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
   argvec[count].reg != 0,
 #endif
   reg_parm_stack_space, argvec[count].partial,
+  args_so_far,
   NULL_TREE, &args_size, &argvec[count].locate);
  args_size.constant += argvec[count].locate.size.constant;
  gcc_assert (!argvec[count].locate.size.var);
diff --git a/gcc/common.opt b/gcc/common.opt
index de9b848eda5..982417c1e39 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2705,6 +2705,10 @@ fstack-usage
 Common RejectNegative Var(flag_stack_usage)
 Output stack usage information on a per-function basis.
 
+fstack-use-cumulative-args
+Common RejectNegative Var(flag_stack_use_cumulative_args) Init(STACK_USE_CUMULATIVE_ARGS_INIT)
+Use cumulative args-based stack layout hooks.
+
 fstrength-reduce
 Common Ignore
 Does nothing.  Preserved for backward compatibility.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index edd12655c4a..046d691af56 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1070,6 +1070,13 @@ case ${target} in
   ;;
 esac
 
+# Figure 

Re: [COMMITTED] path solver: Solve PHI imports first for ranges.

2021-11-13 Thread Aldy Hernandez via Gcc-patches
On Sat, Nov 13, 2021 at 1:51 AM Andrew MacLeod  wrote:
>
> On 11/12/21 14:50, Richard Biener via Gcc-patches wrote:
> > On November 12, 2021 8:46:25 PM GMT+01:00, Aldy Hernandez via Gcc-patches 
> >  wrote:
> >> PHIs must be resolved first while solving ranges in a block,
> >> regardless of where they appear in the import bitmap.  We went through
> >> a similar exercise for the relational code, but missed these.
> > Must not all stmts be resolved in program order (for optimality at least)?
>
> Generally, imports are live-on-entry values to a block, so their order is
> not particularly important; they are all simultaneous. PHIs are also
> considered imports for data flow purposes, but they happen before the
> first stmt, all simultaneously. They need to be distinguished because
> phi arguments can refer to other phi defs which may be in this block
> live around a back edge, and we need to be sure we get the right version.
>
> We should look closer to be sure this isn't an accidental fix that
> leaves the root problem. We need to be sure *all* the PHI arguments
> are resolved from outside this block. What's the testcase?

The testcase is the simpler testcase from the PR:

https://gcc.gnu.org/bugzilla/attachment.cgi?id=51776

The gist is on a path coming in from BB13:

# n_42 = PHI 
# m_31 = PHI <0(13), m_16(4)>

We were solving m_31 first and putting it in the cache, and then the
calculation for n_42 picked up this cached m_31 incorrectly.

With my patch we do the PHIs first, in whatever order gphi_iterator
uses, which I assume is the order in the IL above.

However, if PHIs must be resolved simultaneously, then perhaps we need
to tweak this.  Suppose we flip the definitions:

# m_31 = PHI <0(13), m_16(4)>
# n_42 = PHI 

I assume the definition of n_42 should pick up the incoming m_31(13),
not one defined in the other PHI.  In which case, we could resolve all
the PHIs first, but put them in the cache after we're done with all of
them.
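
A rough standalone sketch of that two-phase idea, not the path solver
code: ranges are reduced to plain ints, the cache is a bare array
indexed by a made-up name id, and all `mock_*` names are hypothetical.
Each PHI is represented only by its argument on the incoming path edge.

```c
#include <assert.h>

enum { MOCK_MAX_PHIS = 8 };

/* One PHI as seen along a single incoming edge (hypothetical model).  */
typedef struct
{
  int dest;     /* id of the name this PHI defines */
  int is_const; /* nonzero if the edge argument is a constant */
  int value;    /* the constant, or the id of the name being read */
} mock_phi;

/* Sequential solving: each result goes into the cache right away, so a
   later PHI can pick up a value defined by an earlier PHI in the same
   block.  */
static void
mock_solve_phis_sequential (int *cache, const mock_phi *phis, int n)
{
  for (int i = 0; i < n; i++)
    cache[phis[i].dest]
      = phis[i].is_const ? phis[i].value : cache[phis[i].value];
}

/* Simultaneous solving: read every argument from the incoming cache
   first, then commit all results, so PHI order no longer matters.  */
static void
mock_solve_phis_simultaneous (int *cache, const mock_phi *phis, int n)
{
  int pending[MOCK_MAX_PHIS];

  assert (n <= MOCK_MAX_PHIS);
  for (int i = 0; i < n; i++)
    pending[i] = phis[i].is_const ? phis[i].value : cache[phis[i].value];
  for (int i = 0; i < n; i++)
    cache[phis[i].dest] = pending[i];
}
```

In the flipped ordering, with one PHI defining m_31 from a constant and
a second reading the incoming m_31, the sequential version hands the
second PHI the freshly defined value, while the two-phase version hands
it the value that was live on entry.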

Thoughts?
Aldy