date:20201120

[PATCH] RISC-V: Always define MULTILIB_DEFAULTS

2020-11-20 Thread Kito Cheng

 - Defining MULTILIB_DEFAULTS can reduce the total number of multilibs if
   the default arch and ABI are listed in the multilib config.

 - This also simplifies the implementation of --with-multilib-list.
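
The MULTILIB_DEFAULTS definition in this patch relies on two-level macro
stringizing, so that the values passed through tm_defines are expanded before
they are turned into string literals.  A minimal standalone sketch of that
mechanism follows.  The rv64gc/lp64d values, the default_multilib_options
array and the default_multilib_option helper are assumptions for illustration
only (in GCC the values come from config.gcc), and the inner helper macro is
named STRINGIZING_1 here rather than __STRINGIZING.

```c
#include <assert.h>
#include <string.h>

/* Assumed example values; in the patch these come from tm_defines.  */
#define TARGET_RISCV_DEFAULT_ARCH rv64gc
#define TARGET_RISCV_DEFAULT_ABI lp64d

/* Two levels are needed: the outer macro expands its argument first,
   the inner one applies the # operator to the expanded tokens.  */
#define STRINGIZING(s) STRINGIZING_1(s)
#define STRINGIZING_1(s) #s

#define MULTILIB_DEFAULTS \
  { "march=" STRINGIZING (TARGET_RISCV_DEFAULT_ARCH), \
    "mabi=" STRINGIZING (TARGET_RISCV_DEFAULT_ABI) }

/* Hypothetical consumer of the macro, for illustration only.  */
static const char *const default_multilib_options[] = MULTILIB_DEFAULTS;

/* Return the I-th default option; I must be 0 or 1.  */
static const char *
default_multilib_option (unsigned i)
{
  assert (i < sizeof default_multilib_options
              / sizeof default_multilib_options[0]);
  return default_multilib_options[i];
}
```

With a single-level macro the result would be the literal macro name
("march=TARGET_RISCV_DEFAULT_ARCH") instead of its value.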

gcc/ChangeLog:

* config.gcc (riscv*-*-*): Add TARGET_RISCV_DEFAULT_ABI and
TARGET_RISCV_DEFAULT_ARCH to tm_defines.
Remove including riscv/withmultilib.h for --with-multilib-list.
* config/riscv/riscv.h (STRINGIZING): New.
(__STRINGIZING): Ditto.
(MULTILIB_DEFAULTS): Ditto.
* config/riscv/withmultilib.h: Remove.
---
 gcc/config.gcc  | 39 ++---
 gcc/config/riscv/riscv.h|  9 ++
 gcc/config/riscv/withmultilib.h | 51 -
 3 files changed, 11 insertions(+), 88 deletions(-)
 delete mode 100644 gcc/config/riscv/withmultilib.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 0ae58482657..b12cf9b0f9d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4610,6 +4610,7 @@ case "${target}" in
exit 1
;;
esac
+   tm_defines="${tm_defines} TARGET_RISCV_DEFAULT_ARCH=${with_arch}"
 
# Make sure --with-abi is valid.  If it was not specified,
# pick a default based on the ISA, preferring soft-float
@@ -4631,6 +4632,7 @@ case "${target}" in
exit 1
;;
esac
+   tm_defines="${tm_defines} TARGET_RISCV_DEFAULT_ABI=${with_abi}"
 
# Make sure ABI and ISA are compatible.
case "${with_abi},${with_arch}" in
@@ -4673,7 +4675,6 @@ case "${target}" in
 
# Handle --with-multilib-list.
if test "x${with_multilib_list}" != xdefault; then
-   tm_file="${tm_file} riscv/withmultilib.h"
tmake_file="${tmake_file} riscv/t-withmultilib"
 
case ${with_multilib_list} in
@@ -4685,42 +4686,6 @@ case "${target}" in
echo "--with-multilib-list=${with_multilib_list} not supported."
exit 1
esac
-
-   # Define macros to select the default multilib.
-   case ${with_arch} in
-   rv32gc)
-   tm_defines="${tm_defines} TARGET_MLIB_ARCH=1"
-   ;;
-   rv64gc)
-   tm_defines="${tm_defines} TARGET_MLIB_ARCH=2"
-   ;;
-   *)
-   echo "unsupported --with-arch for --with-multilib-list"
-   exit 1
-   esac
-   case ${with_abi} in
-   ilp32)
-   tm_defines="${tm_defines} TARGET_MLIB_ABI=1"
-   ;;
-   ilp32f)
-   tm_defines="${tm_defines} TARGET_MLIB_ABI=2"
-   ;;
-   ilp32d)
-   tm_defines="${tm_defines} TARGET_MLIB_ABI=3"
-   ;;
-   lp64)
-   tm_defines="${tm_defines} TARGET_MLIB_ABI=4"
-   ;;
-   lp64f)
-   tm_defines="${tm_defines} TARGET_MLIB_ABI=5"
-   ;;
-   lp64d)
-   tm_defines="${tm_defines} TARGET_MLIB_ABI=6"
-   ;;
-   *)
-   echo "unsupported --with-abi for --with-multilib"
-   exit 1
-   esac
fi
;;
 
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index e71fbf31279..df3003fbaa0 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -76,6 +76,15 @@ extern const char *riscv_default_mtune (int argc, const char **argv);
 #define ASM_MISA_SPEC ""
 #endif
 
+/* Reference:
+ https://gcc.gnu.org/onlinedocs/cpp/Stringizing.html#Stringizing  */
+#define STRINGIZING(s) __STRINGIZING(s)
+#define __STRINGIZING(s) #s
+
+#define MULTILIB_DEFAULTS \
+  {"march=" STRINGIZING (TARGET_RISCV_DEFAULT_ARCH), \
+   "mabi=" STRINGIZING (TARGET_RISCV_DEFAULT_ABI) }
+
 #undef ASM_SPEC
 #define ASM_SPEC "\
 %(subtarget_asm_debugging_spec) \
diff --git a/gcc/config/riscv/withmultilib.h b/gcc/config/riscv/withmultilib.h
deleted file mode 100644
index d022716d3b8..000
--- a/gcc/config/riscv/withmultilib.h
+++ /dev/null
@@ -1,51 +0,0 @@
-/* MULTILIB_DEFAULTS definitions for --with-multilib-list.
-   Copyright (C) 2018-2020 Free Software Foundation, Inc.
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it

Re: [PATCH] gcov: Add __gcov_info_to_gdca()

2020-11-20 Thread Martin Liška


On 11/17/20 10:57 AM, Sebastian Huber wrote:

This is a proposal to get the gcda data for a gcda info in a free-standing
environment.  It is intended to be used with the -fprofile-info-section option.
A crude test program which doesn't use a linker script is:


Hello.

I'm not quite sure how this setup is going to work. Can you please explain it
to me?

I was thinking about your needs and I can imagine various techniques for
generating the gcda file format:

1) The embedded system can override fopen, fwrite and fseek with functions
that perform the corresponding write-related operations remotely.

2) - use -fprofile-info-section
   - run an app on an embedded system and do a memory dump to a terminal/console
   - take the memory dump to a host system (with IO), run
     __gcov_init_from_memory_dump (...) and then do a normal __gcov_dump

What do you think about it?

Btw. in the next stage1 I'm planning to commit the removal of the internal I/O
buffering.
Martin
From 5a17015c096012b9e43a8dd45768a8d5fb3a3aee Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 18 Nov 2020 16:13:23 +0100
Subject: [PATCH] gcov: Use system IO buffering

gcc/ChangeLog:

	* gcov-io.c (gcov_write_block): Remove.
	(gcov_write_words): Likewise.
	(gcov_read_words): Re-implement using gcov_read_bytes.
	(gcov_allocate): Remove.
	(GCOV_BLOCK_SIZE): Likewise.
	(struct gcov_var): Remove most of the fields.
	(gcov_position): Implement with ftell.
	(gcov_rewrite): Remove setting of start and offset fields.
	(from_file): Re-format.
	(gcov_open): Remove setbuf call. It should not be needed.
	(gcov_close): Remove internal buffer handling.
	(gcov_magic): Use __builtin_bswap32.
	(gcov_write_counter): Use gcov_write_unsigned directly.
	(gcov_write_string): Use direct fwrite and do not round
	to 4 bytes.
	(gcov_seek): Use fseek directly.
	(gcov_write_tag): Use gcov_write_unsigned directly.
	(gcov_write_length): Likewise.
	(gcov_write_tag_length): Likewise.
	(gcov_read_bytes): Use fread directly.
	(gcov_read_unsigned): Use gcov_read_words.
	(gcov_read_counter): Likewise.
	(gcov_read_string): Use gcov_read_bytes.
	* gcov-io.h (GCOV_WORD_SIZE): Adjust to reflect
	that size is now in bytes, not words (4B).
	(GCOV_TAG_FUNCTION_LENGTH): Likewise.
	(GCOV_TAG_ARCS_LENGTH): Likewise.
	(GCOV_TAG_ARCS_NUM): Likewise.
	(GCOV_TAG_COUNTER_LENGTH): Likewise.
	(GCOV_TAG_COUNTER_NUM): Likewise.
	(GCOV_TAG_SUMMARY_LENGTH): Likewise.

libgcc/ChangeLog:

	* libgcov-driver.c: Fix GNU coding style.
---
 gcc/gcov-io.c   | 282 +---
 gcc/gcov-io.h   |  17 ++-
 libgcc/libgcov-driver.c |   2 +-
 3 files changed, 75 insertions(+), 226 deletions(-)

diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c
index 4db56f8aacf..c3ca404f8b5 100644
--- a/gcc/gcov-io.c
+++ b/gcc/gcov-io.c
@@ -27,40 +27,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 /* Routines declared in gcov-io.h.  This file should be #included by
another source file, after having #included gcov-io.h.  */
 
-#if !IN_GCOV
-static void gcov_write_block (unsigned);
-static gcov_unsigned_t *gcov_write_words (unsigned);
-#endif
-static const gcov_unsigned_t *gcov_read_words (unsigned);
-#if !IN_LIBGCOV
-static void gcov_allocate (unsigned);
-#endif
-
-/* Optimum number of gcov_unsigned_t's read from or written to disk.  */
-#define GCOV_BLOCK_SIZE (1 << 10)
+static gcov_unsigned_t *gcov_read_words (void *buffer, unsigned);
 
 struct gcov_var
 {
   FILE *file;
-  gcov_position_t start;	/* Position of first byte of block */
-  unsigned offset;		/* Read/write position within the block.  */
-  unsigned length;		/* Read limit in the block.  */
-  unsigned overread;		/* Number of words overread.  */
   int error;			/* < 0 overflow, > 0 disk error.  */
-  int mode;	/* < 0 writing, > 0 reading */
+  int mode;			/* < 0 writing, > 0 reading */
   int endian;			/* Swap endianness.  */
-#if IN_LIBGCOV
-  /* Holds one block plus 4 bytes, thus all coverage reads & writes
- fit within this buffer and we always can transfer GCOV_BLOCK_SIZE
- to and from the disk. libgcov never backtracks and only writes 4
- or 8 byte objects.  */
-  gcov_unsigned_t buffer[GCOV_BLOCK_SIZE + 1];
-#else
-  /* Holds a variable length block, as the compiler can write
- strings and needs to backtrack.  */
-  size_t alloc;
-  gcov_unsigned_t *buffer;
-#endif
 } gcov_var;
 
 /* Save the current position in the gcov file.  */
@@ -71,8 +45,7 @@ static inline
 gcov_position_t
 gcov_position (void)
 {
-  gcov_nonruntime_assert (gcov_var.mode > 0); 
-  return gcov_var.start + gcov_var.offset;
+  return ftell (gcov_var.file);
 }
 
 /* Return nonzero if the error flag is set.  */
@@ -92,20 +65,16 @@ GCOV_LINKAGE inline void
 gcov_rewrite (void)
 {
   gcov_var.mode = -1; 
-  gcov_var.start = 0;
-  gcov_var.offset = 0;
   fseek (gcov_var.file, 0L, SEEK_SET);
 }
 #endif
 
-static inline gcov_unsigned_t from_file (gcov_unsigned_t value)
+static inline gcov_unsigned_t
+from_file (gcov_unsigned_t value)
 {

Re: Improve handling of memory operands in ipa-icf 4/4

2020-11-20 Thread Martin Liška


On 11/19/20 11:14 AM, Jan Hubicka wrote:

On 11/16/20 12:20 AM, Jan Hubicka wrote:

This is controlled by -fipa-icf-alias-sets

The patch drops the TBAA information too early, so we may end up processing a
function twice.  Also, if merging is not performed, we lose code quality for no
gain (this is a rare case).  My original plan was to remember the mismatched
parameters and apply them only after the merging decisions are final, but I was
not sure how to do that in ipa-icf.  In particular, we need to ensure
transitivity: if function foo is merged into bar, we also need to be sure that
we dropped the base alias sets in functions that are called by bar, even if
they themselves are not merged. Martin, is there an easy way to implement this
on top of the current ICF?


Well, you will need to create a set of merged functions and then traverse all
callers of these (via cgraph_node callers). That should not be too difficult,
should it?




Hey.


Well, imagine you have function A1 and A2
and calls A1->B2
and   A2->B3
and there is also B3.


You likely mean A1->B1 and A2->B2, right?



Now A1 is ICF equivalent to A2
and also B1,B2,B3 are ICF equivalent if some TBAA info is dropped.


ICF works by maintaining classes (groups) of functions that we currently
consider equivalent, and we subdivide these classes as we go. When, e.g., foo1
and foo2 are known to be different, a class containing bar1->foo1 and
bar2->foo2 is split as well.
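
The class splitting described above can be modeled as a small partition
refinement.  The sketch below is a simplified model and not the ipa-icf
implementation: it assumes each function has at most one callee, and the names
refine_classes, callee_class and MAX_FUNCS are hypothetical.

```c
#include <assert.h>

#define MAX_FUNCS 16

/* Class of the callee of function I, or -1 if I calls nothing.  */
static int
callee_class (const int callee[], const int cls[], int i)
{
  return callee[i] < 0 ? -1 : cls[callee[i]];
}

/* Refine the classes in CLS (one class id per function) until two
   functions share a class only if their callees share a class too.
   CALLEE[i] is the single function called by i, or -1.
   Returns the final number of classes.  */
static int
refine_classes (int n, const int callee[], int cls[])
{
  int prev = 0;

  assert (n > 0 && n <= MAX_FUNCS);
  for (;;)
    {
      int newcls[MAX_FUNCS];
      int count = 0;

      for (int i = 0; i < n; i++)
        {
          newcls[i] = -1;
          /* Reuse the class of an earlier function that matches both
             in its own class and in the class of its callee.  */
          for (int j = 0; j < i; j++)
            if (cls[j] == cls[i]
                && (callee_class (callee, cls, j)
                    == callee_class (callee, cls, i)))
              {
                newcls[i] = newcls[j];
                break;
              }
          if (newcls[i] < 0)
            newcls[i] = count++;
        }
      for (int i = 0; i < n; i++)
        cls[i] = newcls[i];
      /* Classes are only ever split, so an unchanged count means a
         fixed point has been reached.  */
      if (count == prev)
        return count;
      prev = count;
    }
}
```

With functions numbered foo1 = 0, foo2 = 1, bar1 = 2, bar2 = 3 and callees
{-1, -1, 0, 1}, starting from classes {0, 1, 2, 2} (foo1 and foo2 already known
to differ) separates bar1 from bar2, while starting from {0, 0, 1, 1} keeps
them together.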



ICF merges A2 into A1.
It also considers merging B2 and B3 into B1, but concludes at the very end
that it is not beneficial (because some of them have their address taken and
constructing a wrapper is too expensive).


We never do a revert of a decision!



The comparisons done are
  B1:B2
  B1:B3
  A1:A2
So after comparing we have info what to drop in B1 to make merging B2->B1
and B3->B1 valid.  We also have info what to drop in B2 to make B1->B2
valid and in B3 to make B3->B1 valid.

But we need to drop info in B2 to make B3->B2 valid to make call path
alias A2 of A1->B2 safe.


So what you need is to skip a division of groups based on the "strict aliasing"
and mark all functions that will need a drop operation.

Hope it helps?

Martin



Honza

Re: [PATCH v2] Add if-chain to switch conversion pass.

2020-11-20 Thread Martin Liška


On 11/19/20 3:46 PM, Richard Biener wrote:

OK, so can you send an updated patch?


Sure.

Martin
From 76e8424bd54d15fb3b2a2bdb4179fa8773500381 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 28 Aug 2020 10:26:13 +0200
Subject: [PATCH] Add if-chain to switch conversion pass.

gcc/ChangeLog:

	PR tree-optimization/14799
	PR ipa/88702
	* Makefile.in: Add gimple-if-to-switch.o.
	* dbgcnt.def (DEBUG_COUNTER): Add new debug counter.
	* passes.def: Include new pass_if_to_switch pass.
	* timevar.def (TV_TREE_IF_TO_SWITCH): New timevar.
	* tree-pass.h (make_pass_if_to_switch): New.
	* tree-ssa-reassoc.c (struct operand_entry): Move to the header.
	(dump_range_entry): Move to header file.
	(debug_range_entry): Likewise.
	(no_side_effect_bb): Make it global.
	* tree-switch-conversion.h (simple_cluster::simple_cluster):
	Add inline for couple of functions in order to prevent error
	about multiple defined symbols.
	* gimple-if-to-switch.cc: New file.
	* tree-ssa-reassoc.h: New file.

gcc/testsuite/ChangeLog:

	PR tree-optimization/14799
	PR ipa/88702
	* gcc.dg/tree-ssa/pr96480.c: Disable if-to-switch conversion.
	* gcc.dg/tree-ssa/reassoc-32.c: Likewise.
	* g++.dg/tree-ssa/if-to-switch-1.C: New test.
	* gcc.dg/tree-ssa/if-to-switch-1.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-2.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-3.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-4.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-5.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-6.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-7.c: New test.
	* gcc.dg/tree-ssa/if-to-switch-8.c: New test.
---
 gcc/Makefile.in   |   1 +
 gcc/dbgcnt.def|   1 +
 gcc/gimple-if-to-switch.cc| 565 ++
 gcc/passes.def|   1 +
 .../g++.dg/tree-ssa/if-to-switch-1.C  |  25 +
 .../gcc.dg/tree-ssa/if-to-switch-1.c  |  35 ++
 .../gcc.dg/tree-ssa/if-to-switch-2.c  |  11 +
 .../gcc.dg/tree-ssa/if-to-switch-3.c  |  11 +
 .../gcc.dg/tree-ssa/if-to-switch-4.c  |  36 ++
 .../gcc.dg/tree-ssa/if-to-switch-5.c  |  12 +
 .../gcc.dg/tree-ssa/if-to-switch-6.c  |  42 ++
 .../gcc.dg/tree-ssa/if-to-switch-7.c  |  25 +
 .../gcc.dg/tree-ssa/if-to-switch-8.c  |  27 +
 gcc/testsuite/gcc.dg/tree-ssa/pr96480.c   |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c|   2 +-
 gcc/timevar.def   |   1 +
 gcc/tree-pass.h   |   1 +
 gcc/tree-ssa-reassoc.c|  27 +-
 gcc/tree-ssa-reassoc.h|  48 ++
 gcc/tree-switch-conversion.h  |  24 +-
 20 files changed, 865 insertions(+), 32 deletions(-)
 create mode 100644 gcc/gimple-if-to-switch.cc
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/if-to-switch-1.C
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-4.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-8.c
 create mode 100644 gcc/tree-ssa-reassoc.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 778ec09c75d..16be66fefc6 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1364,6 +1364,7 @@ OBJS = \
 	gimple-array-bounds.o \
 	gimple-builder.o \
 	gimple-expr.o \
+	gimple-if-to-switch.o \
 	gimple-iterator.o \
 	gimple-fold.o \
 	gimple-laddress.o \
diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index a5b6bb66a6c..c0744b23f65 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -170,6 +170,7 @@ DEBUG_COUNTER (if_after_combine)
 DEBUG_COUNTER (if_after_reload)
 DEBUG_COUNTER (if_conversion)
 DEBUG_COUNTER (if_conversion_tree)
+DEBUG_COUNTER (if_to_switch)
 DEBUG_COUNTER (ipa_cp_bits)
 DEBUG_COUNTER (ipa_cp_values)
 DEBUG_COUNTER (ipa_cp_vr)
diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
new file mode 100644
index 000..25ef45175a7
--- /dev/null
+++ b/gcc/gimple-if-to-switch.cc
@@ -0,0 +1,565 @@
+/* If-elseif-else to switch conversion pass
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General

Re: [PATCH] c++, v2: Add __builtin_clear_padding builtin - C++20 P0528R3 compiler side [PR88101]

2020-11-20 Thread Richard Biener

On Thu, 19 Nov 2020, Jakub Jelinek wrote:

> Hi!
> 
> This is the whole __builtin_clear_padding patchset merged into a single
> patch, + 2 new changes - one is that fold_builtin_1 now folds the
> 1 argument (meant for users) __builtin_clear_padding into an internal
> 2 argument form, where the second argument is NULL of the first argument's
> type, such that gimplifier's stripping of useless type conversions doesn't
> change behavior, and handling NULLPTR_TYPE as all padding bits, because
> lvalue-to-rvalue conversions with decltype(nullptr) type don't really read
> anything from the memory and so we need to clear all the bits as padding.
> Here is the full description:
> 
> The following patch implements __builtin_clear_padding builtin that clears
> the padding bits in object representation (but preserves value
> representation).  Inside of unions it clears only those padding bits that
> are padding for all the union members (so that it never alters value
> representation).
> 
> It handles trailing padding, padding in the middle of structs including
> bitfields (PDP11 unhandled, I've never figured out how those bitfields
> work), VLAs (doesn't handle variable length structures, but I think almost
> nobody uses them and it isn't worth the extra complexity).  For VLAs and
> sufficiently large arrays it uses runtime clearing loop instead of emitting
> straight-line code (unless arrays are inside of a union).
> 
> The way I think this can be used for atomics is e.g. if the structures
> are power of two sized and small enough that we use the hw atomics
> for say compare_exchange __builtin_clear_padding could be called first on
> the address of expected and desired arguments (for desired only if we want
> to ensure that most of the time the atomic memory will have padding bits
> cleared), then perform the weak cmpxchg and if that fails, we got the
> value from the atomic memory; we can call __builtin_clear_padding on a copy
> of that and then compare it with expected, and if it is the same with the
> padding bits masked off, we can use the original with whatever random
> padding bits in it as the new expected for next cmpxchg.
> __builtin_clear_padding itself is not atomic and therefore it shouldn't
> be called on the atomic memory itself, but compare_exchange*'s expected
> argument is a reference and normally the implementation may store there
> the current value from memory, so padding bits can be cleared in that,
> and desired is passed by value rather than reference, so clearing is fine
> too.
> 
> When using libatomic, we can use it either that way, or add new libatomic
> APIs that accept another argument, pointer to the padding bit bitmask,
> and construct that in the template as
>   alignas (_T) unsigned char _mask[sizeof (_T)];
>   std::memset (_mask, ~0, sizeof (_mask));
>   __builtin_clear_padding ((_T *) _mask);
> which will have bits cleared for padding bits and set for bits taking part
> in the value representation.  Then libatomic could internally instead
> of using memcmp compare
> for (i = 0; i < N; i++) if ((val1[i] & mask[i]) != (val2[i] & mask[i]))
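
The masked comparison at the end of the quoted description can be sketched in
portable C.  In this sketch the mask is built by hand with offsetof for a
hypothetical struct S, as a stand-in for the memset plus
__builtin_clear_padding sequence quoted above; build_mask and masked_equal are
illustrative names and not libatomic APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type; on most ABIs it has padding between C and I.  */
struct S
{
  char c;
  int i;
};

/* Set mask bytes to 0xff for value bytes and 0 for padding bytes.
   MASK must point to sizeof (struct S) bytes.  */
static void
build_mask (unsigned char *mask)
{
  memset (mask, 0, sizeof (struct S));
  memset (mask + offsetof (struct S, c), 0xff, sizeof (char));
  memset (mask + offsetof (struct S, i), 0xff, sizeof (int));
}

/* Compare N bytes of A and B, ignoring bits that are clear in MASK.
   Returns 1 if the masked representations are equal, 0 otherwise.  */
static int
masked_equal (const void *a, const void *b,
              const unsigned char *mask, size_t n)
{
  const unsigned char *pa = a;
  const unsigned char *pb = b;

  assert (a != NULL && b != NULL && mask != NULL);
  for (size_t k = 0; k < n; k++)
    if ((pa[k] & mask[k]) != (pb[k] & mask[k]))
      return 0;
  return 1;
}
```

Two objects with the same member values compare equal under the mask even if
their padding bytes differ, which a plain memcmp does not guarantee.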
> 
> Tested on x86_64-linux, ok for trunk if it passes full bootstrap/regtest?
> 
> 2020-11-19  Jakub Jelinek  
> 
>   PR libstdc++/88101
> gcc/
>   * builtins.def (BUILT_IN_CLEAR_PADDING): New built-in function.
>   * builtins.c (fold_builtin_1): Handle BUILT_IN_CLEAR_PADDING.
>   * gimple-fold.c (clear_padding_unit, clear_padding_buf_size): New
>   const variables.
>   (struct clear_padding_struct): New type.
>   (clear_padding_flush, clear_padding_add_padding,
>   clear_padding_emit_loop, clear_padding_type,
>   clear_padding_union, clear_padding_real_needs_padding_p,
>   clear_padding_type_may_have_padding_p,
>   gimple_fold_builtin_clear_padding): New functions.
>   (gimple_fold_builtin): Handle BUILT_IN_CLEAR_PADDING.
>   * doc/extend.texi (__builtin_clear_padding): Document.
> gcc/c-family/
>   * c-common.c (check_builtin_function_arguments): Handle
>   BUILT_IN_CLEAR_PADDING.
> gcc/testsuite/
>   * c-c++-common/builtin-clear-padding-1.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-1.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-2.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-3.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-4.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-5.c: New test.
>   * g++.dg/torture/builtin-clear-padding-1.C: New test.
>   * g++.dg/torture/builtin-clear-padding-2.C: New test.
>   * gcc.dg/builtin-clear-padding-1.c: New test.
> 
> --- gcc/builtins.def.jj   2020-11-18 09:38:28.481816977 +0100
> +++ gcc/builtins.def  2020-11-19 16:15:50.573639579 +0100
> @@ -839,6 +839,7 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_CLEAR_C
>  /* [trans-mem]: Adjust BUILT_IN_TM_CALLOC if BUILT_IN_CALLOC is changed.  */
>  DEF_LIB_BUILTIN(BUILT_IN_CALLOC, "calloc",

Re: [PATCH] gcov: Add __gcov_info_to_gdca()

2020-11-20 Thread Sebastian Huber


On 20/11/2020 09:37, Martin Liška wrote:


On 11/17/20 10:57 AM, Sebastian Huber wrote:
This is a proposal to get the gcda data for a gcda info in a free-standing
environment.  It is intended to be used with the -fprofile-info-section option.
A crude test program which doesn't use a linker script is:


Hello.

I'm not quite sure how this setup is going to work. Can you please explain it
to me?

I was thinking about your needs and I can imagine various techniques for
generating the gcda file format:

1) The embedded system can override fopen, fwrite and fseek with functions
that perform the corresponding write-related operations remotely.

Yes, this is one option; however, inhibit_libc disables quite a lot of libgcov
functionality if Newlib is used, for example.


2) - use -fprofile-info-section
   - run an app on an embedded system and do a memory dump to a terminal/console
   - take the memory dump to a host system (with IO), run
     __gcov_init_from_memory_dump (...) and then do a normal __gcov_dump


I am not sure if a plain memory dump really simplifies things. You have to get
the filename separately since it is only referenced in gcov_info and not
included in the structure:


struct gcov_info
{
[...]
  const char *filename;        /* output file name */
[...]
#ifndef IN_GCOV_TOOL
  const struct gcov_fn_info *const *functions; /* pointer to pointers
                                                  to function information  */
[...]
#endif /* !IN_GCOV_TOOL */
};

Also the gcov_fn_info is not embedded in the gcov_info structure. If you do a
plain memory dump, then you dump also pointers and how do you deal with these
pointers on the host? You would need some extra information to describe the
memory dump. So, why not use the gcda format for this? It is also more compact
since zero value counters are skipped. Serial lines are slow, so less data to
transfer is good.


/* Convert the gcov information to a gcda data stream.  The first callback is
   called exactly once with the filename associated with the gcov information.
   The filename may be NULL.  Afterwards, the second callback is subsequently
   called with chunks (the begin and length of the chunk are passed as the
   first two arguments) of the gcda data stream.  The fourth parameter is a
   user-provided argument passed as the last argument to the callback
   functions.  */

extern void __gcov_info_to_gcda (const struct gcov_info *gi_ptr,
                 void (*filename) (const char *name, void *arg),
                 void (*dump) (const void *begin, unsigned size, void *arg),
                 void *arg);

If __gcov_info_to_gcda() is correctly implemented, then this should directly
give you gcda files if you use something like this:


#include <gcov.h>
#include <stdio.h>

extern const struct gcov_info *__gcov_info_start[];
extern const struct gcov_info *__gcov_info_end[];

static void
filename (const char *f, void *arg)
{
  FILE **file = arg;
  *file = fopen(f, "wb");
}

static void
dump (const void *d, unsigned n, void *arg)
{
  FILE **file = arg;
  fwrite(d, n, 1, *file);
}

static void
dump_gcov_info (void)
{
  const struct gcov_info **info = __gcov_info_start;
  const struct gcov_info **end = __gcov_info_end;

  /* Obfuscate variable to prevent compiler optimizations.  */
  __asm__ ("" : "+r" (end));

  while (info != end)
  {
    FILE *file = NULL;
    __gcov_info_to_gcda (*info, filename, dump, &file);
    fclose(file);
    ++info;
  }
}

int
main()
{
  dump_gcov_info();
  return 0;
}

The callback functions give the user full control over how the data of the
gcda file is encoded for the transfer to a host. No gcov internals are exposed.
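
As an illustration of such an encoding, a dump callback could hex-encode each
chunk before it is pushed over a slow text-only link.  This is only a sketch
under stated assumptions: struct hex_sink and dump_hex are hypothetical names,
and the in-memory buffer stands in for whatever transport the target provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sink for the encoded stream; on a real target this
   would be a UART or similar.  */
struct hex_sink
{
  char *buf;   /* output buffer */
  size_t cap;  /* capacity of BUF in bytes */
  size_t len;  /* bytes written so far */
};

/* Matches the shape of the proposed dump callback: encode the chunk
   [BEGIN, BEGIN + SIZE) as lowercase hex digits into the sink.  */
static void
dump_hex (const void *begin, unsigned size, void *arg)
{
  static const char digits[] = "0123456789abcdef";
  const unsigned char *p = begin;
  struct hex_sink *sink = arg;

  /* Each input byte produces two output characters.  */
  assert (sink->len + 2 * (size_t) size <= sink->cap);
  for (unsigned i = 0; i < size; i++)
    {
      sink->buf[sink->len++] = digits[p[i] >> 4];
      sink->buf[sink->len++] = digits[p[i] & 0xf];
    }
}
```

A pointer to a struct hex_sink would then be passed as the last argument of
__gcov_info_to_gcda, in place of the FILE ** used in the example above.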


--
embedded brains GmbH
Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
Phone: +49-89-18 94 741 - 16
Fax:   +49-89-18 94 741 - 08
PGP: Public key available on request.


[PATCH] i386: Optimize abs expansion [PR97873]

2020-11-20 Thread Uros Bizjak via Gcc-patches

The patch introduces absM named pattern to generate optimal insn sequence
for CMOVE_TARGET targets.  Currently, the expansion goes through neg+max
optabs, and the following code is generated:

movl    %edi, %eax
negl    %eax
cmpl    %edi, %eax
cmovl   %edi, %eax

This sequence is suboptimal in two ways.  a) The compare instruction is
not needed, since the NEG insn sets the sign flag based on the result.
The CMOV can use the sign flag to select between the negated and the
original value:

movl    %edi, %eax
negl    %eax
cmovs   %edi, %eax

b) On some targets, CMOV is undesirable due to its performance issues.
In addition to TARGET_EXPAND_ABS bypass, the patch introduces STV
conversion of abs RTX to use PABS SSE insn:

vmovd   %edi, %xmm0
vpabsd  %xmm0, %xmm0
vmovd   %xmm0, %eax

The patch changes compare mode of NEG instruction to CCGOCmode,
which is the same mode as the mode of SUB instruction. IOW, sign bit
becomes usable.
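
The flag-based selection described in a) can be modeled in C.  This is an
illustrative model of the neg + cmovs sequence, not code from the patch; the
names abs_neg_cmovs and check_abs_model are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Model of the neg + cmovs sequence: negate, then keep the original
   value when the negated result is negative (sign flag set).  */
static int
abs_neg_cmovs (int x)
{
  /* Negate in unsigned arithmetic to avoid signed-overflow UB.  */
  unsigned int neg = 0u - (unsigned int) x;
  /* Sign bit of the negated value, i.e. what SF would hold after NEG.  */
  int sign = (neg >> (sizeof (unsigned int) * CHAR_BIT - 1)) & 1;

  /* cmovs: if the sign flag is set, select the original value.  */
  return sign ? x : (int) neg;
}

/* Small self-check of the model.  */
static void
check_abs_model (void)
{
  assert (abs_neg_cmovs (5) == 5);
  assert (abs_neg_cmovs (-5) == 5);
  assert (abs_neg_cmovs (0) == 0);
  /* As with the hardware sequence, INT_MIN maps to itself.  */
  assert (abs_neg_cmovs (INT_MIN) == INT_MIN);
}
```

No separate comparison is needed: the sign of the negated value alone decides
which operand is selected.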

Also, the mode iterator of the <maxmin:code><mode>3 pattern is changed
to SWI48x instead of SWI248. The purpose of the maxmin expander is to
prepare max/min RTX for STV to eventually convert them to SSE PMAX/PMIN
instructions, in order to *avoid* CMOV insns with general registers.

2020-11-20  Uroš Bizjak  

gcc/
PR target/97873
* config/i386/i386.md (*neg<mode>2_2): Rename from
"*neg<mode>2_cmpz".  Use CCGOCmode instead of CCZmode.
(*negsi2_zext): Rename from *negsi2_cmpz_zext.
Use CCGOCmode instead of CCZmode.
(*neg<mode>_ccc_1): New insn pattern.
(*neg<dwi>2_doubleword): Use *neg<mode>_ccc_1.

(abs<mode>2): Add FLAGS_REG clobber.
Use TARGET_CMOVE insn predicate.
(*abs<mode>2_1): New insn_and_split pattern.
(*absdi2_doubleword): Ditto.

(<maxmin:code><mode>3): Use SWI48x mode iterator.
(*<maxmin:code><mode>3): Use SWI48 mode iterator.

* config/i386/i386-features.c
(general_scalar_chain::compute_convert_gain): Handle ABS code.
(general_scalar_chain::convert_insn): Ditto.
(general_scalar_to_vector_candidate_p): Ditto.

gcc/testsuite/
PR target/97873
* gcc.target/i386/pr97873.c: New test.
* gcc.target/i386/pr97873-1.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to mainline.

Uros.
diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index 620f7f157f4..ff6676f54f7 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -581,7 +581,8 @@ general_scalar_chain::compute_convert_gain ()
   else if (GET_CODE (src) == NEG
   || GET_CODE (src) == NOT)
igain += m * ix86_cost->add - ix86_cost->sse_op - COSTS_N_INSNS (1);
-  else if (GET_CODE (src) == SMAX
+  else if (GET_CODE (src) == ABS
+  || GET_CODE (src) == SMAX
   || GET_CODE (src) == SMIN
   || GET_CODE (src) == UMAX
   || GET_CODE (src) == UMIN)
@@ -986,13 +987,6 @@ general_scalar_chain::convert_insn (rtx_insn *insn)
 
   switch (GET_CODE (src))
 {
-case ASHIFT:
-case ASHIFTRT:
-case LSHIFTRT:
-  convert_op (&XEXP (src, 0), insn);
-  PUT_MODE (src, vmode);
-  break;
-
 case PLUS:
 case MINUS:
 case IOR:
@@ -1002,8 +996,14 @@ general_scalar_chain::convert_insn (rtx_insn *insn)
 case SMIN:
 case UMAX:
 case UMIN:
-  convert_op (&XEXP (src, 0), insn);
   convert_op (&XEXP (src, 1), insn);
+  /* FALLTHRU */
+
+case ABS:
+case ASHIFT:
+case ASHIFTRT:
+case LSHIFTRT:
+  convert_op (&XEXP (src, 0), insn);
   PUT_MODE (src, vmode);
   break;
 
@@ -1414,6 +1414,12 @@ general_scalar_to_vector_candidate_p (rtx_insn *insn, enum machine_mode mode)
return false;
   break;
 
+case ABS:
+  if ((mode == DImode && !TARGET_AVX512VL)
+ || (mode == SImode && !TARGET_SSSE3))
+   return false;
+  break;
+
 case NEG:
 case NOT:
   break;
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 29935014772..13c995c1a02 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10062,8 +10062,8 @@
   "#"
   "reload_completed"
   [(parallel
-[(set (reg:CCZ FLAGS_REG)
- (compare:CCZ (neg:DWIH (match_dup 1)) (const_int 0)))
+[(set (reg:CCC FLAGS_REG)
+ (ne:CCC (match_dup 1) (const_int 0)))
  (set (match_dup 0) (neg:DWIH (match_dup 1)))])
(parallel
 [(set (match_dup 2)
@@ -10096,36 +10096,46 @@
   [(set_attr "type" "negnot")
(set_attr "mode" "SI")])
 
-;; The problem with neg is that it does not perform (compare x 0),
-;; it really performs (compare 0 x), which leaves us with the zero
-;; flag being the only useful item.
-
-(define_insn "*neg<mode>2_cmpz"
-  [(set (reg:CCZ FLAGS_REG)
-   (compare:CCZ
+(define_insn "*neg<mode>2_2"
+  [(set (reg FLAGS_REG)
+   (compare
  (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0"))
-  (const_int 0)))
+ (const_int 0)))
(set (match_operand:SWI 0 "nonimmediate_operand" "=m")
(neg:SWI (match_dup 1)))]
-  "ix86_unary_o

Re: [PATCH] gcov: Add __gcov_info_to_gdca()

2020-11-20 Thread Martin Liška


On 11/20/20 10:25 AM, Sebastian Huber wrote:

On 20/11/2020 09:37, Martin Liška wrote:


On 11/17/20 10:57 AM, Sebastian Huber wrote:

This is a proposal to get the gcda data for a gcda info in a free-standing
environment.  It is intended to be used with the -fprofile-info-section option.
A crude test program which doesn't use a linker script is:


Hello.

I'm not quite sure how this setup is going to work. Can you please explain it
to me?

I was thinking about your needs and I can imagine various techniques for
generating the gcda file format:

1) The embedded system can override fopen, fwrite and fseek with functions
that perform the corresponding write-related operations remotely.

Yes, this is one option; however, inhibit_libc disables quite a lot of libgcov
functionality if Newlib is used, for example.


I see. Btw. do you have Newlib available in the embedded environment? If so,
what I/O functionality does it provide?



2) - use -fprofile-info-section
   - run an app on an embedded system and do a memory dump to a terminal/console
   - take the memory dump to a host system (with IO), run
     __gcov_init_from_memory_dump (...) and then do a normal __gcov_dump


I am not sure if a plain memory dump really simplifies things. You have to get 
the filename separately since it is only referenced in gcov_info and not 
included in the structure:

struct gcov_info
{
[...]
   const char *filename;        /* output file name */
[...]
#ifndef IN_GCOV_TOOL
   const struct gcov_fn_info *const *functions; /* pointer to pointers
   to function information  */
[...]
#endif /* !IN_GCOV_TOOL */
};


I see!



Also the gcov_fn_info is not embedded in the gcov_info structure. If you do a 
plain memory dump, then you dump also pointers and how do you deal with these 
pointers on the host? You would need some extra information to describe the 
memory dump. So, why not use the gcda format for this? It is also more compact 
since zero value counters are skipped. Serial lines are slow, so less data to 
transfer is good.

/* Convert the gcov information to a gcda data stream.  The first callback is
    called exactly once with the filename associated with the gcov information.
    The filename may be NULL.  Afterwards, the second callback is subsequently
    called with chunks (the begin and length of the chunk are passed as the
    first two arguments) of the gcda data stream.  The fourth parameter is a
    user-provided argument passed as the last argument to the callback
    functions.  */

extern void __gcov_info_to_gcda (const struct gcov_info *gi_ptr,
                  void (*filename) (const char *name, void *arg),
                  void (*dump) (const void *begin, unsigned size, void *arg),
                  void *arg);

If __gcov_info_to_gcda() is correctly implemented, then this should give you 
directly gcda files if you use something like this:

#include <gcov.h>
#include <stdio.h>

extern const struct gcov_info *__gcov_info_start[];
extern const struct gcov_info *__gcov_info_end[];

static void
filename (const char *f, void *arg)
{
   FILE **file = arg;
   *file = fopen(f, "wb");
}

static void
dump (const void *d, unsigned n, void *arg)
{
   FILE **file = arg;
   fwrite(d, n, 1, *file);
}

static void
dump_gcov_info (void)
{
   const struct gcov_info **info = __gcov_info_start;
   const struct gcov_info **end = __gcov_info_end;

   /* Obfuscate variable to prevent compiler optimizations.  */
   __asm__ ("" : "+r" (end));

   while (info != end)
   {
     FILE *file = NULL;
     __gcov_info_to_gcda (*info, filename, dump, &file);
     fclose(file);
     ++info;
   }
}

int
main()
{
   dump_gcov_info();
   return 0;
}

The callback functions give the user full control over how the data of the
gcda file is encoded for the transfer to a host. No gcov internals are exposed.



All right. Btw. how will you implement these 2 callbacks on the embedded target?
Apart from these 2 hooks, I bet you will also need the gcov_position and
gcov_seek functions, as can be seen in the patch I sent.

Martin

Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2020-11-20 Thread Richard Biener via Gcc-patches

On Fri, Apr 3, 2020 at 8:15 PM Egeyar Bagcioglu wrote:
>
>
>
> On 3/18/20 10:05 AM, Martin Liška wrote:
> > On 3/17/20 7:43 PM, Egeyar Bagcioglu wrote:
> >> Hi Martin,
> >>
> >> I like the patch. It definitely serves our purposes at Oracle and
> >> provides another way to do what my previous patches did as well.
> >>
> >> 1) It keeps the backwards compatibility regarding
> >> -frecord-gcc-switches; therefore, removes my related doubts about
> >> your previous patch.
> >>
> >> 2) It still makes use of -frecord-gcc-switches. The new option is
> >> only to control the format. This addresses some previous objections
> >> to having a new option doing something similar. Now the new option
> >> controls the behaviour of the existing one and that behaviour can be
> >> further extended.
> >>
> >> 3) It uses an environment variable as Jakub suggested.
> >>
> >> The patch looks good and I confirm that it works for our purposes.
> >
> > Hello.
> >
> > Thank you for the support.
> >
> >>
> >> Having said that, I have to ask for recognition in this patch for my
> >> and my company's contributions. Can you please keep my name and my
> >> work email in the changelog and in the commit message?
> >
> > Sure, sorry I forgot.
>
> Hi Martin,
>
> I noticed that some comments in the patch were still referring to
> --record-gcc-command-line, the option I suggested earlier. I updated
> those comments to mention -frecord-gcc-switches-format instead and also
> added my name to the patch as you agreed above. I attached the updated
> patch. We are starting to use this patch in the specific domain where we
> need its functionality.

So while I like the addition of -frecord-gcc-switches-format to preserve
backward compatibility there are IMHO several issues with the patch.

For one, the target hook change to a void(void) hook forces it to
duplicate too many internals.  Please make it take a const char *
argument, the actual text to output.

Second, I see no good reason to have different handling of
-grecord-gcc-switches vs. -frecord-gcc-switches - they should
produce the same content and thus -frecord-gcc-switches-format
should apply to -grecord-gcc-switches as well.

Now - I think what we output into the assembly file with -fverbose-asm
should _always_ be the processed arguments since the assembly
is produced by cc1, not gcc.

+/* Return value of env variable GCC_DRIVER_COMMAND_LINE if exists.
+   Otherwise return empty string.  */
+
+const char *
+get_driver_command_line ()
+{
+  const char *cmdline = getenv ("GCC_DRIVER_COMMAND_LINE");
+  return cmdline != NULL ? cmdline : "";
+}

I think silently emitting nothing is not a good idea.  In particular using
the environment to carry the driver command-line makes it difficult
to reproduce output by pasting commands as dumped by -v [-save-temps].
Likewise the driver seems to always populate GCC_DRIVER_COMMAND_LINE
even if not needed - why not look for -frecord-gcc-switches-format before
doing so?  I'd make -frecord-gcc-switches-format a driver-only option
(only the driver can do something about it) and amend -frecord-gcc-switches
with a -frecord-gcc-switches=FILE variant specifying the content.  The
driver would then substitute -frecord-gcc-switches-format=driver
with -frecord-gcc-switches=tempfile and dump the command line into
tempfile.  cc1 can then pick contents from tempfile or use the processed
variant if no contents are specified.
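
The fallback described above can be sketched roughly as follows.  This is
not GCC code: pick_recorded_switches and its parameters are hypothetical
names, PATH merely models the FILE operand of the suggested
-frecord-gcc-switches=FILE variant, and PROCESSED models the switches as
cc1 sees them.  The one point it tries to illustrate is that an unreadable
file is reported as an error rather than silently recording nothing.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC.  Returns the text to record,
   or NULL on error.  */
static const char *
pick_recorded_switches (const char *path, const char *processed,
                        char *buf, size_t size)
{
  /* No file given: fall back to the processed variant.  */
  if (path == NULL)
    return processed;

  if (size == 0)
    return NULL;

  FILE *f = fopen (path, "r");
  /* A file was requested but cannot be read: report failure instead of
     silently recording an empty string.  */
  if (f == NULL)
    return NULL;

  /* Read at most size - 1 bytes and terminate the string.  */
  size_t n = fread (buf, 1, size - 1, f);
  buf[n] = '\0';
  fclose (f);
  return buf;
}
```

A caller would treat a NULL result as a hard error, which is the opposite
of what the getenv-based helper quoted above does.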

Richard.


> Regards
> Egeyar
>
>
> >
> > Martin
> >
> >>
> >> Thanks
> >> Egeyar
> >>
> >>
> >>
> >> On 3/17/20 2:53 PM, Martin Liška wrote:
> >>> Hi.
> >>>
> >>> I'm sending enhanced patch that makes the following changes:
> >>> - a new option -frecord-gcc-switches-format is added; the option
> >>>   selects format (processed, driver) for all options that record
> >>>   GCC command line
> >>> - Dwarf gen_producer_string is now used in -fverbose-asm
> >>> - The .s file is affected in the following way:
> >>>
> >>> BEFORE:
> >>>
> >>> # GNU C17 (SUSE Linux) version 9.2.1 20200128 [revision
> >>> 83f65674e78d97d27537361de1a9d74067ff228d] (x86_64-suse-linux)
> >>> #compiled by GNU C version 9.2.1 20200128 [revision
> >>> 83f65674e78d97d27537361de1a9d74067ff228d], GMP version 6.2.0, MPFR
> >>> version 4.0.2, MPC version 1.1.0, isl version isl-0.22.1-GMP
> >>>
> >>> # GGC heuristics: --param ggc-min-expand=100 --param
> >>> ggc-min-heapsize=131072
> >>> # options passed:  -fpreprocessed test.i -march=znver1 -mmmx -mno-3dnow
> >>> # -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mmovbe -maes -msha
> >>> # -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi
> >>> -mno-sgx
> >>> # -mbmi2 -mno-pconfig -mno-wbnoinvd -mno-tbm -mavx -mavx2 -msse4.2
> >>> -msse4.1
> >>> # -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw
> >>> # -madx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er
> >>> -mno-avx512cd
> >>> # -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec -mxsaves
> >>> # -mno-avx512dq -mno-avx512bw -mno-avx512vl

Use OEP_MATCH_SIDE_EFFECTS in ao_compare_refs

2020-11-20 Thread Jan Hubicka

Hi,
As Jakub reminded me, I introduced OEP_MATCH_SIDE_EFFECTS for cases like
ICF or tail merging where we merge accesses from different code paths.
By default operand_equal_p is designed for accesses from one code path
where we do not want to merge two side effects.

Since compare_ao_refs is currently used only in ICF, we should use it.
If in the future it is used in other contexts, perhaps we will need a
parameter controlling it, but for now I think adding
OEP_MATCH_SIDE_EFFECTS is a good way to go.

This enables merging of volatile accesses in Firefox (we check that the
other access is also volatile).

With this on libxul build we get:

571 libxul.so.wpa.076i.icf:  false returned: 'compare_ao_refs failed (dependence clique difference)' in compare_operand at ../../gcc/ipa-icf-gimple.c:373
676 libxul.so.wpa.076i.icf:  false returned: 'compare_ao_refs failed (semantic difference)' in compare_operand at ../../gcc/ipa-icf-gimple.c:361
676 libxul.so.wpa.076i.icf:  false returned: 'METHOD_TYPE and FUNCTION_TYPE mismatch' in equals_wpa at ../../gcc/ipa-icf.c:674
707 libxul.so.wpa.076i.icf:  false returned: 'operand_equal_p failed' in compare_operand at ../../gcc/ipa-icf-gimple.c:381

Which is very low.  I will still analyze the remaining 707
operand_equal_p failures.  Some of them are defects in how we match
parameters:
_9 = Schedule (this_3(D), aManifestURI_4(D), aDocumentURI_5(D), aLoadingPrincipal_6(D), 0B, aWindow_2(D), 0B, aUpdate_7(D));
_9 = Schedule (this_2(D), aManifestURI_3(D), aDocumentURI_4(D), aLoadingPrincipal_5(D), 0B, 0B, aProfileDir_6(D), aUpdate_7(D));

Since we do not hash anything for SSA_NAME, we see it as Schedule (0,0) in
both cases.  I am testing a fix.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* tree-ssa-alias.c (ao_compare::compare_ao_refs,
ao_compare::hash_ao_ref): Use OEP_MATCH_SIDE_EFFECTS.
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 5ebbb087285..311ce66892b 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -3985,11 +3985,12 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
return SEMANTICS;
 
   /* Now we can compare the address of actual memory access.  */
-  if (!operand_equal_p (r1, r2, OEP_ADDRESS_OF))
+  if (!operand_equal_p (r1, r2, OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS))
return SEMANTICS;
 }
   /* For constant accesses we get more matches by comparing offset only.  */
-  else if (!operand_equal_p (base1, base2, OEP_ADDRESS_OF))
+  else if (!operand_equal_p (base1, base2,
+OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS))
 return SEMANTICS;
 
   /* We can't simply use get_object_alignment_1 on the full
@@ -4197,11 +4198,11 @@ ao_compare::hash_ao_ref (ao_ref *ref, bool 
lto_streaming_safe, bool tbaa,
  r = TREE_OPERAND (r, 0);
}
   hash_operand (TYPE_SIZE (TREE_TYPE (ref->ref)), hstate, 0);
-  hash_operand (r, hstate, OEP_ADDRESS_OF);
+  hash_operand (r, hstate, OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS);
 }
   else
 {
-  hash_operand (tbase, hstate, OEP_ADDRESS_OF);
+  hash_operand (tbase, hstate, OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS);
   hstate.add_poly_int (ref->offset);
   hstate.add_poly_int (ref->size);
   hstate.add_poly_int (ref->max_size);

[PATCH] Deal with (pattern) SLP consumed stmts in hybrid discovery

2020-11-20 Thread Richard Biener

This makes hybrid SLP discovery deal with stmts indirectly consumed
by SLP, for example via patterns.  This means that all uses of a
stmt end up in SLP vectorized stmts.

This helps my prototype patches for PR97832 where I make SLP discovery
re-associate chains to make operands match.  This ends up building
SLP computation nodes without 1:1 representatives in the scalar IL
and thus no scalar lane defs in SLP_TREE_SCALAR_STMTS.  Nevertheless
all of the original scalar stmts are consumed so this represents
another kind of SLP pattern for the computation chain result.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Tamar - can you check whether this helps you avoid all the
relevancy push/pop stuff as well as avoid any pattern
marking for SLP patterns at all?

2020-11-20  Richard Biener  

* tree-vect-slp.c (maybe_push_to_hybrid_worklist): New function.
(vect_detect_hybrid_slp): Use it.  Perform a backward walk
over the IL.
---
 gcc/tree-vect-slp.c | 79 +
 1 file changed, 72 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 486ee95d5d2..f87ac3c049f 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3439,6 +3439,63 @@ vect_detect_hybrid_slp (tree *tp, int *, void *data)
   return NULL_TREE;
 }
 
+/* Look if STMT_INFO is consumed by SLP indirectly and mark it pure_slp
+   if so, otherwise pushing it to WORKLIST.  */
+
+static void
+maybe_push_to_hybrid_worklist (vec_info *vinfo,
+  vec &worklist,
+  stmt_vec_info stmt_info)
+{
+  if (dump_enabled_p ())
+dump_printf_loc (MSG_NOTE, vect_location,
+"Processing hybrid candidate : %G", stmt_info->stmt);
+  stmt_vec_info orig_info = vect_orig_stmt (stmt_info);
+  imm_use_iterator iter2;
+  ssa_op_iter iter1;
+  use_operand_p use_p;
+  def_operand_p def_p;
+  bool any_def = false;
+  FOR_EACH_PHI_OR_STMT_DEF (def_p, orig_info->stmt, iter1, SSA_OP_DEF)
+{
+  any_def = true;
+  FOR_EACH_IMM_USE_FAST (use_p, iter2, DEF_FROM_PTR (def_p))
+   {
+ stmt_vec_info use_info = vinfo->lookup_stmt (USE_STMT (use_p));
+ /* An out-of loop use means this is a loop_vect sink.  */
+ if (!use_info)
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"Found loop_vect sink: %G", stmt_info->stmt);
+ worklist.safe_push (stmt_info);
+ return;
+   }
+ else if (!STMT_SLP_TYPE (vect_stmt_to_vectorize (use_info)))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"Found loop_vect use: %G", use_info->stmt);
+ worklist.safe_push (stmt_info);
+ return;
+   }
+   }
+}
+  /* No def means this is a loop_vect sink.  */
+  if (!any_def)
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"Found loop_vect sink: %G", stmt_info->stmt);
+  worklist.safe_push (stmt_info);
+  return;
+}
+  if (dump_enabled_p ())
+dump_printf_loc (MSG_NOTE, vect_location,
+"Marked SLP consumed stmt pure: %G", stmt_info->stmt);
+  STMT_SLP_TYPE (stmt_info) = pure_slp;
+}
+
 /* Find stmts that must be both vectorized and SLPed.  */
 
 void
@@ -3448,9 +3505,14 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo)
 
   /* All stmts participating in SLP are marked pure_slp, all other
  stmts are loop_vect.
- First collect all loop_vect stmts into a worklist.  */
+ First collect all loop_vect stmts into a worklist.
+ SLP patterns cause not all original scalar stmts to appear in
+ SLP_TREE_SCALAR_STMTS and thus not all of them are marked pure_slp.
+ Rectify this here and do a backward walk over the IL only considering
+ stmts as loop_vect when they are used by a loop_vect stmt and otherwise
+ mark them as pure_slp.  */
   auto_vec worklist;
-  for (unsigned i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
+  for (int i = LOOP_VINFO_LOOP (loop_vinfo)->num_nodes - 1; i >= 0; --i)
 {
   basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
   for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
@@ -3459,10 +3521,11 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo)
  gphi *phi = gsi.phi ();
  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (phi);
  if (!STMT_SLP_TYPE (stmt_info) && STMT_VINFO_RELEVANT (stmt_info))
-   worklist.safe_push (stmt_info);
+   maybe_push_to_hybrid_worklist (loop_vinfo,
+  worklist, stmt_info);
}
-  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-  gsi_next (&gsi))
+  for (gimple_stmt_iterator gsi = gsi_last_bb (bb); !gsi_end_p (gsi)

Re: [PATCH] Additional small changes to support opaque modes

2020-11-20 Thread Richard Sandiford via Gcc-patches

acsawdey--- via Gcc-patches  writes:
> diff --git a/gcc/c/c-aux-info.c b/gcc/c/c-aux-info.c
> index ffc8099856d..41f5598de38 100644
> --- a/gcc/c/c-aux-info.c
> +++ b/gcc/c/c-aux-info.c
> @@ -413,6 +413,10 @@ gen_type (const char *ret_val, tree t, formals_style 
> style)
> data_type = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (t)));
> break;
>  
> + case OPAQUE_TYPE:
> +   data_type = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (t)));
> +   break;
> +

Might as well just add this case to the REAL_TYPE one.

>   case VOID_TYPE:
> data_type = "void";
> break;
> […]
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 54eb445665c..d6d12efff34 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -13037,6 +13037,7 @@ is_base_type (tree type)
>return 1;
>  
>  case VOID_TYPE:
> +case OPAQUE_TYPE:
>  case ARRAY_TYPE:
>  case RECORD_TYPE:
>  case UNION_TYPE:
> @@ -16767,7 +16768,7 @@ loc_descriptor (rtx rtl, machine_mode mode,
>break;
>  
>  case CONST_INT:
> -  if (mode != VOIDmode && mode != BLKmode)
> +  if (mode != VOIDmode && mode != BLKmode && !OPAQUE_MODE_P (mode))
>   {
> int_mode = as_a  (mode);
> loc_result = address_of_int_loc_descriptor (GET_MODE_SIZE (int_mode),

I realise I'm asking this about something that already appears to handle
BLKmode CONST_INTs (?!), but this is the one change in the patch I
struggled with.  Why do we see a CONST_INT that allegedly has an
opaque mode?  It feels like something has gone wrong further up the
call chain.

This might still be the expedient fix for whatever is happening,
but I think it deserves a comment at least.

The rest looks good to me FWIW.

Richard

Hash anonymous namespace ODR names in ipa-icf

2020-11-20 Thread Jan Hubicka

Hi,
When building libxul, 400k mismerges are due to THIS parameters of member
functions of polymorphic types that are in anonymous namespaces.  We already
hash ODR names of the types, but all anonymous types have name "".  These can
be distinguished by TYPE_MAIN_VARIANT, which is what this patch implements.
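
As a toy illustration of the idea (not GCC's data structures): hash by
mangled name when the type has a usable one, and otherwise mix in an
identifier that is unique per type.  The struct toy_type, its fields, the
FNV-1a string hash and the example names are all assumptions made for this
sketch; the actual patch uses IDENTIFIER_HASH_VALUE and TYPE_UID on the
main variant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a class type; the fields are illustrative only and do
   not mirror GCC's tree representation.  */
struct toy_type
{
  const char *mangled_name;     /* may be NULL; shared by anonymous types */
  bool in_anonymous_namespace;
  unsigned main_variant_uid;    /* unique per distinct type */
};

/* FNV-1a style mixing over a string, used only as a simple string hash.  */
static uint64_t
hash_string (uint64_t h, const char *s)
{
  for (; *s; ++s)
    {
      h ^= (unsigned char) *s;
      h *= UINT64_C (1099511628211);
    }
  return h;
}

static uint64_t
hash_class_type (uint64_t seed, const struct toy_type *t)
{
  /* Named, non-anonymous types: the name identifies the type.  */
  if (t->mangled_name != NULL && !t->in_anonymous_namespace)
    return hash_string (seed, t->mangled_name);

  /* Anonymous types all share one name, so hash the unique id instead.  */
  return (seed ^ t->main_variant_uid) * UINT64_C (1099511628211);
}
```

With this scheme two distinct anonymous types no longer land in the same
hash bucket just because their names compare equal.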

I also noticed that even with !flag_devirtualize we compare types of ctors
but not types of normal methods, so I fixed this.

It would be possible to keep track of merges and disable the late
devirtualization use of this parameter type because late devirtualization
is not very useful.  I will look into that.

Bootstrapped/regtested x86_64-linux, will commit it later today if there
are no complaints.

Honza

* ipa-icf.c (sem_function::equals_wpa): Only compare ODR types with
flag_devirtualize.
(sem_item_optimizer::update_hash_by_addr_refs): Improve hashing of
anonymous ODR names.
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 6ae842766e6..0ed5dd92a1b 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -575,6 +575,7 @@ sem_function::equals_wpa (sem_item *item,
  type memory location for ipa-polymorphic-call and we do not want
  it to get confused by wrong type.  */
   if (DECL_CXX_CONSTRUCTOR_P (decl)
+  && opt_for_fn (decl, flag_devirtualize)
   && TREE_CODE (TREE_TYPE (decl)) == METHOD_TYPE)
 {
   if (TREE_CODE (TREE_TYPE (item->decl)) != METHOD_TYPE)
@@ -2536,11 +2545,25 @@ sem_item_optimizer::update_hash_by_addr_refs ()
  = TYPE_METHOD_BASETYPE (TREE_TYPE (m_items[i]->decl));
inchash::hash hstate (m_items[i]->get_hash ());
 
+   /* Hash ODR types by mangled name if it is defined.
+  If not we know that type is anonymous of free_lang_data
+  was not run and in that case type main variants are
+  unique.  */
if (TYPE_NAME (class_type)
-&& DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (class_type)))
+&& DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (class_type))
+&& !type_in_anonymous_namespace_p
+(class_type))
  hstate.add_hwi
(IDENTIFIER_HASH_VALUE
   (DECL_ASSEMBLER_NAME (TYPE_NAME (class_type))));
+   else
+ {
+   gcc_checking_assert
+(!in_lto_p
+ || type_in_anonymous_namespace_p
+(TYPE_NAME (class_type)));
+   hstate.add_hwi (TYPE_UID (TYPE_MAIN_VARIANT (class_type)));
+ }
 
m_items[i]->set_hash (hstate.end ());
 }

Re: Use OEP_MATCH_SIDE_EFFECTS in ao_compare_refs

2020-11-20 Thread Richard Biener via Gcc-patches

On Fri, Nov 20, 2020 at 10:53 AM Jan Hubicka  wrote:
>
> Hi,
> As Jakub reminded me, I introduced OEP_MATCH_SIDE_EFFECTS for cases like
> ICF or tail merging where we merge accesses from different code paths.
> By default operand_equal_p is designed for accesses from one code path
> where we do not want to merge two side effects.
>
> Since compare_ao_refs is currently used only in ICF, we should use it.
> If in the future it is used in other contexts, perhaps we will need a
> parameter controlling it, but for now I think adding
> OEP_MATCH_SIDE_EFFECTS is a good way to go.
>
> This enables merging of volatile accesses in Firefox (we check that the
> other access is also volatile).
>
> With this on libxul build we get:
>
> 571 libxul.so.wpa.076i.icf:  false returned: 'compare_ao_refs failed (dependence clique difference)' in compare_operand at ../../gcc/ipa-icf-gimple.c:373
> 676 libxul.so.wpa.076i.icf:  false returned: 'compare_ao_refs failed (semantic difference)' in compare_operand at ../../gcc/ipa-icf-gimple.c:361
> 676 libxul.so.wpa.076i.icf:  false returned: 'METHOD_TYPE and FUNCTION_TYPE mismatch' in equals_wpa at ../../gcc/ipa-icf.c:674
> 707 libxul.so.wpa.076i.icf:  false returned: 'operand_equal_p failed' in compare_operand at ../../gcc/ipa-icf-gimple.c:381
>
> Which is very low.  I will still analyze the remaining 707
> operand_equal_p failures.  Some of them are defects in how we match
> parameters:
> _9 = Schedule (this_3(D), aManifestURI_4(D), aDocumentURI_5(D), aLoadingPrincipal_6(D), 0B, aWindow_2(D), 0B, aUpdate_7(D));
> _9 = Schedule (this_2(D), aManifestURI_3(D), aDocumentURI_4(D), aLoadingPrincipal_5(D), 0B, 0B, aProfileDir_6(D), aUpdate_7(D));
>
> Since we do not hash anything for SSA_NAME, we see it as Schedule (0,0) in
> both cases.  I am testing a fix.
>
> Bootstrapped/regtested x86_64-linux, OK?

OK.

>
> Honza
>
> * tree-ssa-alias.c (ao_compare::compare_ao_refs,
> ao_compare::hash_ao_ref): Use OEP_MATCH_SIDE_EFFECTS.
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 5ebbb087285..311ce66892b 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -3985,11 +3985,12 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref 
> *ref2,
> return SEMANTICS;
>
>/* Now we can compare the address of actual memory access.  */
> -  if (!operand_equal_p (r1, r2, OEP_ADDRESS_OF))
> +  if (!operand_equal_p (r1, r2, OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS))
> return SEMANTICS;
>  }
>/* For constant accesses we get more matches by comparing offset only.  */
> -  else if (!operand_equal_p (base1, base2, OEP_ADDRESS_OF))
> +  else if (!operand_equal_p (base1, base2,
> +OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS))
>  return SEMANTICS;
>
>/* We can't simply use get_object_alignment_1 on the full
> @@ -4197,11 +4198,11 @@ ao_compare::hash_ao_ref (ao_ref *ref, bool 
> lto_streaming_safe, bool tbaa,
>   r = TREE_OPERAND (r, 0);
> }
>hash_operand (TYPE_SIZE (TREE_TYPE (ref->ref)), hstate, 0);
> -  hash_operand (r, hstate, OEP_ADDRESS_OF);
> +  hash_operand (r, hstate, OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS);
>  }
>else
>  {
> -  hash_operand (tbase, hstate, OEP_ADDRESS_OF);
> +  hash_operand (tbase, hstate, OEP_ADDRESS_OF | OEP_MATCH_SIDE_EFFECTS);
>hstate.add_poly_int (ref->offset);
>hstate.add_poly_int (ref->size);
>hstate.add_poly_int (ref->max_size);

[PATCH] Simplified construction of constants for popcountSI2/popcountDI2 in libgcc2.c

2020-11-20 Thread Stefan Kanthak

The construction of the "magic" constants 0x55...55, 0x33...33, 0x0f...0f
and 0x01...01 in __popcountSI2 and __popcountDI2 with macros is awkward;
these constants can simply be written as ((UWtype) ~0 / 3),
((UWtype) ~0 / 5), ((UWtype) ~0 / 17) and ((UWtype) ~0 / 255)

Stefan Kanthak

libgcc2.patch
Description: Binary data

Re: [PATCH] gcov: Add __gcov_info_to_gcda()

2020-11-20 Thread Sebastian Huber


On 20/11/2020 10:49, Martin Liška wrote:


On 11/20/20 10:25 AM, Sebastian Huber wrote:

On 20/11/2020 09:37, Martin Liška wrote:


On 11/17/20 10:57 AM, Sebastian Huber wrote:
This is a proposal to get the gcda data for a gcov info in a free-standing
environment.  It is intended to be used with the -fprofile-info-section option.

A crude test program which doesn't use a linker script is:


Hello.

I'm not quite sure how this setup is going to work. Can you please
explain it to me?


I was thinking about your needs and I can imagine various techniques
for generating the gcda file format:

1) the embedded system can override fopen, fwrite and fseek with functions
that perform the write-related operations remotely

Yes, this is one option, however, the inhibit_libc disables quite a 
lot of libgcov functionality if Newlib is used for example.


I see. Btw do you have available Newlib in the embedded environment? 
If so, what I/O functionality is provided?
Yes, I use Newlib with the RTEMS real-time operating system. Newlib
provides the standard C library I/O functions (fopen, etc.). However,
having Newlib available doesn't mean that every application uses it.
Applications are statically linked with the operating system and Newlib.
They only use what is required. Some applications cannot use the
standard C library I/O since it requires a lot of infrastructure and
memory. You can do a lot of things with just a couple of KiBs available.




2) - use -fprofile-info-section
   - run an app on an embedded system and do a memory dump to a 
terminal/console
   - take the memory dump to a host system (with IO), run 
__gcov_init_from_memory_dump (...)

 and then do a normal __gcov_dump


I am not sure if a plain memory dump really simplifies things. You 
have to get the filename separately since it is only referenced in 
gcov_info and not included in the structure:


struct gcov_info
{
[...]
   const char *filename;        /* output file name */
[...]
#ifndef IN_GCOV_TOOL
   const struct gcov_fn_info *const *functions; /* pointer to pointers
   to function information  */
[...]
#endif /* !IN_GCOV_TOOL */
};


I see!



Also the gcov_fn_info is not embedded in the gcov_info structure. If 
you do a plain memory dump, then you dump also pointers and how do 
you deal with these pointers on the host? You would need some extra 
information to describe the memory dump. So, why not use the gcda 
format for this? It is also more compact since zero value counters 
are skipped. Serial lines are slow, so less data to transfer is good.


/* Convert the gcov information to a gcda data stream.  The first callback is
    called exactly once with the filename associated with the gcov information.
    The filename may be NULL.  Afterwards, the second callback is subsequently
    called with chunks (the begin and length of the chunk are passed as the
    first two arguments) of the gcda data stream.  The fourth parameter is a
    user-provided argument passed as the last argument to the callback
    functions.  */

extern void __gcov_info_to_gcda (const struct gcov_info *gi_ptr,
              void (*filename) (const char *name, void *arg),
              void (*dump) (const void *begin, unsigned size, void *arg),
              void *arg);

If __gcov_info_to_gcda() is correctly implemented, then this should 
give you directly gcda files if you use something like this:


#include <gcov.h>
#include <stdio.h>

extern const struct gcov_info *__gcov_info_start[];
extern const struct gcov_info *__gcov_info_end[];

static void
filename (const char *f, void *arg)
{
   FILE **file = arg;
   *file = fopen(f, "wb");
}

static void
dump (const void *d, unsigned n, void *arg)
{
   FILE **file = arg;
   fwrite(d, n, 1, *file);
}

static void
dump_gcov_info (void)
{
   const struct gcov_info **info = __gcov_info_start;
   const struct gcov_info **end = __gcov_info_end;

   /* Obfuscate variable to prevent compiler optimizations.  */
   __asm__ ("" : "+r" (end));

   while (info != end)
   {
     FILE *file = NULL;
     __gcov_info_to_gcda (*info, filename, dump, &file);
     fclose(file);
     ++info;
   }
}

int
main()
{
   dump_gcov_info();
   return 0;
}

The callback functions give the user full control over how the data of
the gcda file is encoded for the transfer to a host. No gcov
internals are exposed.




All right. Btw. how will you implement these 2 callbacks on the 
embedded target?


One option is to convert the gcov info to YAML:

gcov-info:
- file: filename1
  data: <... base64 encoded data from __gcov_info_to_gcda ... >
- file: filename2
  data: ...

Then send the data to the host via a serial line. On the host read the 
data, parse the YAML, and create the gcda files. The 
__gcov_info_to_gcda() needs about 408 bytes of ARM Thumb-2 code and no 
data. You need a polled character output function, the linker set 
iteration and two callbacks. So, you can easily dump the gcov 
information
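
A dump callback for the base64 "data:" field could look roughly like the
following sketch.  It only assumes the callback signature from the proposed
prototype (begin, size, arg); struct b64_state, b64_dump, b64_finish and the
in-memory sink are hypothetical names for the illustration, and a real
target would replace mem_put with its polled character output.  Because the
gcda stream arrives in chunks of arbitrary size and base64 consumes three
bytes at a time, the state carries leftover bytes from one call to the next.

```c
#include <stddef.h>

/* Hypothetical encoder state: up to two bytes are carried between chunks.  */
struct b64_state
{
  unsigned char buf[3];
  unsigned len;
  void (*put) (char c, void *arg);  /* stands in for polled serial output */
  void *put_arg;
};

static const char b64_tab[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Emit the N (1 to 3) buffered bytes as four characters, '='-padded.  */
static void
b64_flush (struct b64_state *s, unsigned n)
{
  unsigned long v = (unsigned long) s->buf[0] << 16;
  if (n > 1)
    v |= (unsigned long) s->buf[1] << 8;
  if (n > 2)
    v |= s->buf[2];
  s->put (b64_tab[(v >> 18) & 63], s->put_arg);
  s->put (b64_tab[(v >> 12) & 63], s->put_arg);
  s->put (n > 1 ? b64_tab[(v >> 6) & 63] : '=', s->put_arg);
  s->put (n > 2 ? b64_tab[v & 63] : '=', s->put_arg);
  s->len = 0;
}

/* Same shape as the dump callback: encode one chunk of the stream.  */
static void
b64_dump (const void *begin, unsigned size, void *arg)
{
  struct b64_state *s = arg;
  const unsigned char *p = begin;

  for (unsigned i = 0; i < size; ++i)
    {
      s->buf[s->len++] = p[i];
      if (s->len == 3)
        b64_flush (s, 3);
    }
}

/* Call once after the last chunk to emit any padded tail.  */
static void
b64_finish (struct b64_state *s)
{
  if (s->len > 0)
    b64_flush (s, s->len);
}

/* Example sink that collects output in memory for inspection.  */
struct mem_sink
{
  char out[64];
  unsigned n;
};

static void
mem_put (char c, void *arg)
{
  struct mem_sink *k = arg;
  if (k->n + 1 < sizeof k->out)
    {
      k->out[k->n++] = c;
      k->out[k->n] = '\0';
    }
}
```

The encoder needs only a few bytes of state and no heap, which fits the
memory budget described above; the host side can use any standard base64
decoder.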

Re: [PATCH] Simplified construction of constants for popcountSI2/popcountDI2 in libgcc2.c

2020-11-20 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 20, 2020 at 11:08:41AM +0100, Stefan Kanthak wrote:
> The construction of the "magic" constants 0x55...55, 0x33...33, 0x0f...0f
> and 0x01...01 in __popcountSI2 and __popcountDI2 with macros is awkward;
> these constants can simply be written as ((UWtype) ~0 / 3),
> ((UWtype) ~0 / 5), ((UWtype) ~0 / 17) and ((UWtype) ~0 / 255)

(UWtype) ~0 will only work if UWtype is unsigned int, don't you really mean
~(UWtype) 0 instead?

Jakub

Re: [PATCH] Remove lambdas from _Rb_tree

2020-11-20 Thread Jonathan Wakely via Gcc-patches


On 20/11/20 08:17 +0100, François Dumont via Libstdc++ wrote:

Here is what I am testing.

I use your enum proposal as an alias for the bool type. I cannot use 
it as template parameter on _M_copy unless I put it at std namespace 
level to use it in definition of the outline _M_copy overload.


You can do it as a member, but the syntax is ugly:

  template
template::_CopyType 
_MoveValues, typename _NodeGen>
  typename _Rb_tree<_Key, _Val, _KoV, _Compare, _Alloc>::_Link_type
  _Rb_tree<_Key, _Val, _KoV, _Compare, _Alloc>::
  _M_copy(_Link_type __x, _Base_ptr __p, _NodeGen& __node_gen)





I also added some tests checking correct usage of 
__move_if_noexcept_cond. I prefer not to change this condition as 
proposed in this patch.


Yes, we can do that later if needed.


I wonder if I am right to check moved values in those tests ?


Yes, I think that's good.

I also wonder after writing those tests if we shouldn't clear the 
moved instance, especially when values are moved ? I remember seeing 
some discussion about this but I don't know the conclusion.


It's not required to clear them.

Leaving them with moved-from values means the memory isn't
deallocated, and those nodes can be reused if the container gets
assigned new values.


    libstdc++: _Rb_tree code cleanup, remove lambdas

    Use new template parameters to replace usage of lambdas to move or not
    tree values on copy.

    libstdc++-v3/ChangeLog:

            * include/bits/move.h (_GLIBCXX_FWDREF): New.
            * include/bits/stl_tree.h: Adapt to use latter.
            (_Rb_tree<>::_M_clone_node): Add _MoveValue template parameter.
            (_Rb_tree<>::_M_mbegin): New.
            (_Rb_tree<>::_M_begin): Use latter.
            (_Rb_tree<>::_M_copy): Add _MoveValues template parameter.
            * testsuite/23_containers/map/allocator/move_cons.cc: New test.
            * testsuite/23_containers/multimap/allocator/move_cons.cc: New test.
            * testsuite/23_containers/multiset/allocator/move_cons.cc: New test.
            * testsuite/23_containers/set/allocator/move_cons.cc: New test.

Ok to commit once all tests have completed?


Yes, OK for trunk, thanks!

RE: [PATCH] [PR target/97726] arm: [testsuite] fix some simd tests on armbe

2020-11-20 Thread Kyrylo Tkachov via Gcc-patches




> -----Original Message-----
> From: Andrea Corallo 
> Sent: 16 November 2020 16:11
> To: Andrea Corallo via Gcc-patches 
> Cc: nd ; Richard Earnshaw ;
> Kyrylo Tkachov 
> Subject: [PATCH] [PR target/97726] arm: [testsuite] fix some simd tests on
> armbe
> 
> Andrea Corallo via Gcc-patches  writes:
> 
> > Hi all,
> >
> > I'd like to submit this patch to fix three testcases reported to be
> > failing on arm big endian on PR target/97727.
> >
> > Okay for trunk?
> >
> > Thanks
> >
> >   Andrea
> 
> Oops, I got the PR number wrong, target/97726 is the correct one.
> 
> Attached the updated patch+changelog.
> 
> Sorry for the trouble.

Ok.
Thanks,
Kyrill

> 
>   Andrea

Re: [PATCH 01/31] PR target/58901: reload: Handle SUBREG of MEM with a mode-dependent address

2020-11-20 Thread Eric Botcazou

> First posted at: .

>   gcc/
>   PR target/58901
>   * reload.c (reload_inner_reg_of_subreg): Also request reloading
>   for pseudo registers associated with mode dependent memory
>   references.
>   (push_reload): Handle pseudo registers.

The handling of this family of reloads is supposed to be done by the block of 
code just above though, i.e. at line 1023.  Can't we add the test based on 
mode_dependent_address_p to this block, e.g. after:

  || (REG_P (SUBREG_REG (in))
  && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER
  && !REG_CAN_CHANGE_MODE_P (REGNO (SUBREG_REG (in)),
                         GET_MODE (SUBREG_REG (in)), inmode

instead?

-- 
Eric Botcazou

[PATCH] c++, v3: Add __builtin_clear_padding builtin - C++20 P0528R3 compiler side [PR88101]

2020-11-20 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 20, 2020 at 09:19:31AM +, Richard Biener wrote:
> > --- gcc/builtins.c.jj   2020-11-19 12:34:10.749514278 +0100
> > +++ gcc/builtins.c  2020-11-19 16:23:55.261250903 +0100
> > @@ -11189,6 +11189,13 @@ fold_builtin_1 (location_t loc, tree exp
> > return build_empty_stmt (loc);
> >break;
> >  
> > +case BUILT_IN_CLEAR_PADDING:
> > +  /* Remember the original type of the argument in an internal
> > +dummy second argument, as in GIMPLE pointer conversions are
> > +useless.  */
> > +  return build_call_expr_loc (loc, fndecl, 2, arg0,
> > + build_zero_cst (TREE_TYPE (arg0)));
> > +
> 
> I'd rather make this change during gimplify_call_expr, I'm not
> even sure at which point we'd hit the above (and if at all).

As discussed on IRC, it was called from the FE folding, or possibly even from
gimplify_call_expr, which also folds builtins.  It wasn't recursing because
once the call had a second argument, fold_builtin_1 would no longer be called
on it (fold_builtin_2 would be instead).
Anyway, I've implemented it now in gimplify_call_expr instead.

> You're using alias-set zero for all stores but since we're
> effectively inspecting the layout of a type we could specify
> that __builtin_clear_padding expects an object of said type
> at the very location active as dynamic type.  So IMHO
> less conservative but still sound would be to use
> build_pointer_type (aggregate-type) in place of ptr_type_node.
> 
> I'm not sure it will make a difference but it might be useful
> to parametrize 'ptr_type_node' in the IL generation?

Ok, added alias_type to the data structure which is passed around,
and initialized it to pointer to the toplevel type (except for VLAs,
in that case just to pointer to the VLA element type).
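For readers unfamiliar with the builtin under discussion, here is a minimal usage sketch, assuming a compiler that provides `__builtin_clear_padding` (GCC 11 or newer). The `struct padded` type and the helper name `nonzero_padding_bytes` are hypothetical and not part of the patch; the sketch only checks the bytes between the two members, which are padding on common ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example type: on common ABIs there are padding bytes
   between `tag` and `value`.  */
struct padded
{
  char tag;
  int value;
};

/* Fill the object with 0xff, set both members, clear the padding, and
   return how many bytes between the members are still nonzero.  */
static size_t
nonzero_padding_bytes (void)
{
  struct padded p;
  memset (&p, 0xff, sizeof p);
  p.tag = 1;
  p.value = 2;

  /* Zero the padding bits only; member values must be left alone.  */
  __builtin_clear_padding (&p);
  assert (p.tag == 1 && p.value == 2);

  /* Inspect the object representation through a byte copy.  */
  unsigned char bytes[sizeof p];
  memcpy (bytes, &p, sizeof p);

  size_t nonzero = 0;
  for (size_t i = sizeof p.tag; i < offsetof (struct padded, value); i++)
    if (bytes[i] != 0)
      nonzero++;
  return nonzero;
}
```

Calling `nonzero_padding_bytes ()` is expected to return 0, since every byte between the members is padding that the builtin clears.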

Here is the updated patch:

2020-11-20  Jakub Jelinek  

PR libstdc++/88101
gcc/
* builtins.def (BUILT_IN_CLEAR_PADDING): New built-in function.
* gimplify.c (gimplify_call_expr): Rewrite single argument
BUILT_IN_CLEAR_PADDING into two-argument variant.
* gimple-fold.c (clear_padding_unit, clear_padding_buf_size): New
const variables.
(struct clear_padding_struct): New type.
(clear_padding_flush, clear_padding_add_padding,
clear_padding_emit_loop, clear_padding_type,
clear_padding_union, clear_padding_real_needs_padding_p,
clear_padding_type_may_have_padding_p,
gimple_fold_builtin_clear_padding): New functions.
(gimple_fold_builtin): Handle BUILT_IN_CLEAR_PADDING.
* doc/extend.texi (__builtin_clear_padding): Document.
gcc/c-family/
* c-common.c (check_builtin_function_arguments): Handle
BUILT_IN_CLEAR_PADDING.
gcc/testsuite/
* c-c++-common/builtin-clear-padding-1.c: New test.
* c-c++-common/torture/builtin-clear-padding-1.c: New test.
* c-c++-common/torture/builtin-clear-padding-2.c: New test.
* c-c++-common/torture/builtin-clear-padding-3.c: New test.
* c-c++-common/torture/builtin-clear-padding-4.c: New test.
* c-c++-common/torture/builtin-clear-padding-5.c: New test.
* g++.dg/torture/builtin-clear-padding-1.C: New test.
* g++.dg/torture/builtin-clear-padding-2.C: New test.
* gcc.dg/builtin-clear-padding-1.c: New test.

--- gcc/builtins.def.jj 2020-11-19 20:00:47.116518082 +0100
+++ gcc/builtins.def    2020-11-20 10:51:44.715684363 +0100
@@ -839,6 +839,7 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_CLEAR_C
 /* [trans-mem]: Adjust BUILT_IN_TM_CALLOC if BUILT_IN_CALLOC is changed.  */
 DEF_LIB_BUILTIN(BUILT_IN_CALLOC, "calloc", BT_FN_PTR_SIZE_SIZE, ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_2_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_CLASSIFY_TYPE, "classify_type", BT_FN_INT_VAR, ATTR_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_CLEAR_PADDING, "clear_padding", BT_FN_VOID_VAR, ATTR_NOTHROW_NONNULL_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN(BUILT_IN_CLZ, "clz", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_CLZIMAX, "clzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_CLZL, "clzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
--- gcc/gimplify.c.jj   2020-11-20 08:43:52.262473979 +0100
+++ gcc/gimplify.c  2020-11-20 10:58:37.035125705 +0100
@@ -3384,6 +3384,20 @@ gimplify_call_expr (tree *expr_p, gimple
cfun->calls_eh_return = true;
break;

+  case BUILT_IN_CLEAR_PADDING:
+   if (call_expr_nargs (*expr_p) == 1)
+ {
+   /* Remember the original type of the argument in an internal
+  dummy second argument, as in GIMPLE pointer conversions are
+  useless.  */
+   p = CALL_EXPR_ARG (*expr_p, 0);
+   *expr_p
+ = build_call_expr_loc (EXPR_LOCATION (*expr_p), fndecl, 2, p,
+build_zero_cst (TREE_TYPE (p)));
+   return GS_OK;
+ }
+   b

Re: [PATCH] Additional small changes to support opaque modes

2020-11-20 Thread Aaron Sawdey via Gcc-patches



> On Nov 20, 2020, at 3:55 AM, Richard Sandiford  
> wrote:
> 
> acsawdey--- via Gcc-patches  writes:
>> diff --git a/gcc/c/c-aux-info.c b/gcc/c/c-aux-info.c
>> index ffc8099856d..41f5598de38 100644
>> --- a/gcc/c/c-aux-info.c
>> +++ b/gcc/c/c-aux-info.c
>> @@ -413,6 +413,10 @@ gen_type (const char *ret_val, tree t, formals_style 
>> style)
>>data_type = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (t)));
>>break;
>> 
>> +case OPAQUE_TYPE:
>> +  data_type = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (t)));
>> +  break;
>> +
> 
> Might as well just add this case to the REAL_TYPE one.
> 
>>  case VOID_TYPE:
>>data_type = "void";
>>break;
>> […]
>> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
>> index 54eb445665c..d6d12efff34 100644
>> --- a/gcc/dwarf2out.c
>> +++ b/gcc/dwarf2out.c
>> @@ -13037,6 +13037,7 @@ is_base_type (tree type)
>>   return 1;
>> 
>> case VOID_TYPE:
>> +case OPAQUE_TYPE:
>> case ARRAY_TYPE:
>> case RECORD_TYPE:
>> case UNION_TYPE:
>> @@ -16767,7 +16768,7 @@ loc_descriptor (rtx rtl, machine_mode mode,
>>   break;
>> 
>> case CONST_INT:
>> -  if (mode != VOIDmode && mode != BLKmode)
>> +  if (mode != VOIDmode && mode != BLKmode && !OPAQUE_MODE_P (mode))
>>  {
>>int_mode = as_a  (mode);
>>loc_result = address_of_int_loc_descriptor (GET_MODE_SIZE (int_mode),
> 
> I realise I'm asking this about something that already appears to handle
> BLKmode CONST_INTs (?!), but this is the one change in the patch I
> struggled with.  Why do we see a CONST_INT that allegedly has an
> opaque mode?  It feels like something has gone wrong further up the
> call chain.
> 
> This might still be the expedient fix for whatever is happening,
> but I think it deserves a comment at least.
> 
> The rest looks good to me FWIW.
> 
> Richard

I should look at this again.  Since I originally put that in, I have switched
the target portion of what I've been doing to use an UNSPEC, which removes all
use of an opaque-mode const_int from the RTL.  This may not be needed any more.

Thanks,
   Aaron

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain

RE: [PATCH] arm: Fix up neon_vector_mem_operand [PR97528]

2020-11-20 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Jakub Jelinek 
> Sent: 19 November 2020 18:57
> To: Richard Earnshaw ; Ramana
> Radhakrishnan ; Kyrylo Tkachov
> 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] arm: Fix up neon_vector_mem_operand [PR97528]
> 
> Hi!
> 
> The documentation for POST_MODIFY says:
>Currently, the compiler can only handle second operands of the
>form (plus (reg) (reg)) and (plus (reg) (const_int)), where
>the first operand of the PLUS has to be the same register as
>the first operand of the *_MODIFY.
> The following testcase ICEs, because combine just attempts to simplify
> things and ends up with
> (post_modify (reg1) (plus (mult (reg2) (const_int 4)) (reg1))
> but the target predicates accept it, because they only verify
> that POST_MODIFY's second operand is PLUS and the second operand
> of the PLUS is a REG.
> 
> The following patch fixes this by performing further verification that
> the POST_MODIFY is in the form it should be.
> 
> Bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk
> and release branches after a while?

Ok.
Thanks,
Kyrill

> 
> 2020-11-19  Jakub Jelinek  
> 
>   PR target/97528
>   * config/arm/arm.c (neon_vector_mem_operand): For POST_MODIFY, require
>   first POST_MODIFY operand is a REG and is equal to the first operand
>   of PLUS.
> 
>   * gcc.target/arm/pr97528.c: New test.
> 
> --- gcc/config/arm/arm.c.jj   2020-11-13 19:00:46.729620560 +0100
> +++ gcc/config/arm/arm.c  2020-11-18 17:05:44.656867343 +0100
> @@ -13429,7 +13429,9 @@ neon_vector_mem_operand (rtx op, int typ
>/* Allow post-increment by register for VLDn */
>if (type == 2 && GET_CODE (ind) == POST_MODIFY
>&& GET_CODE (XEXP (ind, 1)) == PLUS
> -  && REG_P (XEXP (XEXP (ind, 1), 1)))
> +  && REG_P (XEXP (XEXP (ind, 1), 1))
> +  && REG_P (XEXP (ind, 0))
> +  && rtx_equal_p (XEXP (ind, 0), XEXP (XEXP (ind, 1), 0)))
>   return true;
> 
>/* Match:
> --- gcc/testsuite/gcc.target/arm/pr97528.c.jj 2020-11-18 17:09:58.195053288 +0100
> +++ gcc/testsuite/gcc.target/arm/pr97528.c    2020-11-18 17:09:47.839168237 +0100
> @@ -0,0 +1,28 @@
> +/* PR target/97528 */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-options "-O1" }  */
> +/* { dg-add-options arm_neon } */
> +
> +#include <arm_neon.h>
> +
> +typedef __simd64_int16_t T;
> +typedef __simd64_uint16_t U;
> +unsigned short c;
> +int d;
> +U e;
> +
> +void
> +foo (void)
> +{
> +  unsigned short *dst = &c;
> +  int g = d, b = 4;
> +  U dc = e;
> +  for (int h = 0; h < b; h++)
> +{
> +  unsigned short *i = dst;
> +  U j = dc;
> +  vst1_s16 ((int16_t *) i, (T) j);
> +  dst += g;
> +}
> +}
> 
> 
>   Jakub
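The tightened predicate quoted above can be illustrated outside GCC with a toy model. This is only a sketch: `toy_rtx`, `toy_code`, and `toy_post_modify_by_reg_ok` are hypothetical stand-ins, not GCC's real RTL types or the actual `neon_vector_mem_operand` code. It mirrors the rule from the POST_MODIFY documentation that the first operand of the PLUS must be the same register as the first operand of the POST_MODIFY.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes and expressions; these are NOT GCC's
   real rtx types, just enough structure to illustrate the check.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_MULT, TOY_PLUS, TOY_POST_MODIFY };

struct toy_rtx
{
  enum toy_code code;
  int regno;                  /* Meaningful only for TOY_REG.  */
  const struct toy_rtx *op0;  /* First operand, if any.  */
  const struct toy_rtx *op1;  /* Second operand, if any.  */
};

static bool
toy_is_reg (const struct toy_rtx *x)
{
  return x != NULL && x->code == TOY_REG;
}

/* Accept only (post_modify (reg A) (plus (reg A) (reg B))).  */
static bool
toy_post_modify_by_reg_ok (const struct toy_rtx *ind)
{
  if (ind == NULL || ind->code != TOY_POST_MODIFY)
    return false;

  const struct toy_rtx *base = ind->op0;
  const struct toy_rtx *step = ind->op1;

  return step != NULL
         && step->code == TOY_PLUS
         && toy_is_reg (step->op1)   /* The pre-patch check stopped here.  */
         && toy_is_reg (base)        /* New: base must be a register...  */
         && toy_is_reg (step->op0)   /* ...and the PLUS must start with...  */
         && step->op0->regno == base->regno;  /* ...that same register.  */
}
```

In this model, an expression shaped like the one combine produced, (post_modify (reg1) (plus (mult (reg2) (const_int 4)) (reg1))), passes the old two-condition test but is rejected once the last three conditions are added.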

Re: [PATCH] Simplified construction of constants for popcountSI2/popcountDI2 in libgcc2.c

2020-11-20 Thread Stefan Kanthak

Jakub Jelinek  wrote:

> On Fri, Nov 20, 2020 at 11:08:41AM +0100, Stefan Kanthak wrote:
>> The construction of the "magic" constants 0x55...55, 0x33...33, 0x0f...0f
>> and 0x01...01 in __popcountSI2 and __popcountDI2 with macros is awkward;
>> these constants can simply be written as ((UWtype) ~0 / 3),
>> ((UWtype) ~0 / 5), ((UWtype) ~0 / 17) and ((UWtype) ~0 / 255)
> 
> (UWtype) ~0 will only work if UWtype is unsigned int,

Hmmm... U*type is but unsigned, and both (__uint128_t) ~0 / 3 as well as
(unsigned long long) ~0 / 3 yield the expected constant 0x55...55 here,
and the other constants of course too.

> don't you really mean ~(UWtype) 0 instead?

This is indeed the better^Wcorrect solution.
Corrected patch attached.

Stefan

libgcc2.patch
Description: Binary data
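As a quick check of the constants discussed in this message, the following sketch builds the four masks by dividing an all-ones value and uses them in the usual bit-parallel population count. It assumes a 64-bit `UWtype`; the typedef, the `ALL_ONES` macro, and the function names are illustrative only and are not taken from libgcc2.c or the attached patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for libgcc's UWtype; here fixed at 64 bits.  */
typedef uint64_t UWtype;

/* All-ones value built by complementing a zero of the target type, so
   the complement is taken at the full width of UWtype.  */
#define ALL_ONES (~(UWtype) 0)

/* Verify that the divisions yield the familiar repeating patterns.  */
static void
check_masks (void)
{
  assert (ALL_ONES / 3 == UINT64_C (0x5555555555555555));
  assert (ALL_ONES / 5 == UINT64_C (0x3333333333333333));
  assert (ALL_ONES / 17 == UINT64_C (0x0f0f0f0f0f0f0f0f));
  assert (ALL_ONES / 255 == UINT64_C (0x0101010101010101));
}

/* Bit-parallel population count using the divided-down masks.  */
static int
popcount_swar (UWtype x)
{
  const UWtype m1 = ALL_ONES / 3;     /* 0x55...55 */
  const UWtype m2 = ALL_ONES / 5;     /* 0x33...33 */
  const UWtype m4 = ALL_ONES / 17;    /* 0x0f...0f */
  const UWtype h01 = ALL_ONES / 255;  /* 0x01...01 */

  x = x - ((x >> 1) & m1);            /* Sums of adjacent bit pairs.  */
  x = (x & m2) + ((x >> 2) & m2);     /* Sums per nibble.  */
  x = (x + (x >> 4)) & m4;            /* Sums per byte.  */

  /* Multiplying by 0x01...01 accumulates all byte sums in the top byte.  */
  return (int) ((x * h01) >> (sizeof (UWtype) * 8 - 8));
}
```

Writing the all-ones value as `~(UWtype) 0` keeps the complement in the target type regardless of how wide `int` is, which is the form the corrected patch settles on.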

RE: [PATCH] Deal with (pattern) SLP consumed stmts in hybrid discovery

2020-11-20 Thread Tamar Christina via Gcc-patches

Hi Richi,

> -Original Message-
> From: rguent...@ryzen.fritz.box  On Behalf Of
> Richard Biener
> Sent: Friday, November 20, 2020 9:54 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Tamar Christina 
> Subject: [PATCH] Deal with (pattern) SLP consumed stmts in hybrid discovery
> 
> This makes hybrid SLP discovery deal with stmts indirectly consumed by SLP,
> for example via patterns.  This means that all uses of a stmt end up in SLP
> vectorized stmts.
> 
> This helps my prototype patches for PR97832 where I make SLP discovery re-
> associate chains to make operands match.  This ends up building SLP
> computation nodes without 1:1 representatives in the scalar IL and thus no
> scalar lane defs in SLP_TREE_SCALAR_STMTS.  Nevertheless all of the original
> scalar stmts are consumed so this represents another kind of SLP pattern for
> the computation chain result.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> Tamar - can you check if this helps you avoiding all the relevancy push/pop
> stuff as well as avoiding any pattern marking for SLP patterns at all?
> 

Looks like it does allow me to avoid the relevancy and SLP_TYPE markings:

=== vect_detect_hybrid_slp ===
Processing hybrid candidate : slp_patt_48 = .COMPLEX_MUL (_pa_50, _pa_49);
Marked SLP consumed stmt pure: slp_patt_48 = .COMPLEX_MUL (_pa_50, _pa_49);
Processing hybrid candidate : slp_patt_51 = .COMPLEX_MUL (_pa_53, _pa_52);
Marked SLP consumed stmt pure: slp_patt_51 = .COMPLEX_MUL (_pa_53, _pa_52);
Processing hybrid candidate : _24 = _9 * _19;
Marked SLP consumed stmt pure: _24 = _9 * _19;
Processing hybrid candidate : _23 = _10 * _18;
Marked SLP consumed stmt pure: _23 = _10 * _18;
Processing hybrid candidate : _22 = _9 * _18;
Marked SLP consumed stmt pure: _22 = _9 * _18;
Processing hybrid candidate : _17 = _10 * _19;
Marked SLP consumed stmt pure: _17 = _10 * _19;

And costing looks to be ignoring the unrelated statements

0x50bd800 REALPART_EXPR <*_3> 1 times unaligned_load (misalign -1) costs 1 in body
0x50bd800 REALPART_EXPR <*_5> 1 times unaligned_load (misalign -1) costs 1 in body
0x50bd800 .COMPLEX_MUL (_pa_53, _pa_52) 1 times vector_stmt costs 1 in body
0x50bd800 REALPART_EXPR <*_5> 1 times unaligned_load (misalign -1) costs 1 in body
0x50bd800  0 times vec_perm costs 0 in body
0x50bd800 .COMPLEX_MUL (_pa_50, _pa_49) 1 times vector_stmt costs 1 in body
0x50bd800  1 times vec_perm costs 2 in body
0x50bd800 _25 1 times unaligned_store (misalign -1) costs 1 in body

But it doesn't allow me to avoid the pattern markings.

Without the pattern markings it will crash when analyzing the loads

note:   ==> examining statement: _26 = REALPART_EXPR <*_5>;
note:   Vectorizing an unaligned access.
note:   vect_model_load_cost: unaligned supported by hardware.
note:   vect_model_load_cost: inside_cost = 1, prologue_cost = 0 .

as the statement it finds is not being used:

cadd.c:22:6: internal compiler error: in vect_slp_analyze_node_operations_1, at tree-vect-slp.c:3591
   22 | void f90 (TYPE complex a[restrict N], TYPE complex b[restrict N], TYPE complex c[restrict N])
  |  ^~~
0x17200e1 vect_slp_analyze_node_operations_1
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3591
0x17209e8 vect_slp_analyze_node_operations
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3784
0x1720995 vect_slp_analyze_node_operations
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
0x1720995 vect_slp_analyze_node_operations
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
0x1720995 vect_slp_analyze_node_operations
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
0x1720995 vect_slp_analyze_node_operations
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
0x17212f2 vect_slp_analyze_operations(vec_info*)
/gnu-work/src/gcc/gcc/tree-vect-slp.c:3971
0x16e2c6e vect_analyze_loop_2
/gnu-work/src/gcc/gcc/tree-vect-loop.c:2419
0x16e4bcd vect_analyze_loop(loop*, vec_info_shared*)
/gnu-work/src/gcc/gcc/tree-vect-loop.c:2956
0x173ba4d try_vectorize_loop_1
/gnu-work/src/gcc/gcc/tree-vectorizer.c:1010
0x173c23a try_vectorize_loop
/gnu-work/src/gcc/gcc/tree-vectorizer.c:1163
0x173c470 vectorize_loops()
/gnu-work/src/gcc/gcc/tree-vectorizer.c:1244
0x1579275 execute
/gnu-work/src/gcc/gcc/tree-ssa-loop.c:414

I will assume the patch's existence in tree when I send out my COMPLEX_ADD 
patch today then.

Thanks!,
Tamar

> 2020-11-20  Richard Biener  
> 
>   * tree-vect-slp.c (maybe_push_to_hybrid_worklist): New function.
>   (vect_detect_hybrid_slp): Use it.  Perform a backward walk
>   over the IL.
> ---
>  gcc/tree-vect-slp.c | 79
> +
>  1 file changed, 72 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index
> 486ee95d5d2..f87ac3c049f 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -3439,6 +3439,63 @@ vect_detect_hybrid_slp (tree *tp, int *, void
> *data)
>return NULL_TREE;
>  }
> 
> +

Re: [PATCH] libstdc++: Enable without gthreads

2020-11-20 Thread Jonathan Wakely via Gcc-patches


On 19/11/20 19:08 +, Jonathan Wakely wrote:

On 19/11/20 13:36 +, Jonathan Wakely wrote:

On 16/11/20 14:43 -0800, Thomas Rodgers wrote:

This patch looks good to me.


Committed now.


This patch was also needed, but I don't understand why I didn't see
the FAILs on gcc135 in the cfarm.


PCH. My test config on gcc135 uses PCH, and so all headers were
included. The host I tested on later uses --disable-libstdcxx-pch and
so saw the failure.

Mystery solved.

Re: [PATCH] c++, v3: Add __builtin_clear_padding builtin - C++20 P0528R3 compiler side [PR88101]

2020-11-20 Thread Richard Biener

On Fri, 20 Nov 2020, Jakub Jelinek wrote:

> On Fri, Nov 20, 2020 at 09:19:31AM +, Richard Biener wrote:
> > > --- gcc/builtins.c.jj 2020-11-19 12:34:10.749514278 +0100
> > > +++ gcc/builtins.c2020-11-19 16:23:55.261250903 +0100
> > > @@ -11189,6 +11189,13 @@ fold_builtin_1 (location_t loc, tree exp
> > >   return build_empty_stmt (loc);
> > >break;
> > >  
> > > +case BUILT_IN_CLEAR_PADDING:
> > > +  /* Remember the original type of the argument in an internal
> > > +  dummy second argument, as in GIMPLE pointer conversions are
> > > +  useless.  */
> > > +  return build_call_expr_loc (loc, fndecl, 2, arg0,
> > > +   build_zero_cst (TREE_TYPE (arg0)));
> > > +
> > 
> > I'd rather make this change during gimplify_call_expr, I'm not
> > even sure at which point we'd hit the above (and if at all).
> 
> As discussed on IRC, it was called from the FE folding or could be even from
> gimplify_call_expr which also folds builtins and it wasn't recursing
> because after it got second argument fold_builtin_1 wouldn't be called on it
> anymore (but fold_builtin_2).
> Anyway, I've implemented it now in gimplify_call_expr instead.
> 
> > You're using alias-set zero for all stores but since we're
> > effectively inspecting the layout of a type we could specify
> > that __builtin_clear_padding expects an object of said type
> > at the very location active as dynamic type.  So IMHO
> > less conservative but still sound would be to use
> > build_pointer_type (aggregate-type) in place of ptr_type_node.
> > 
> > I'm not sure it will make a difference but it might be useful
> > to parametrize 'ptr_type_node' in the IL generation?
> 
> Ok, added alias_type to the data structure which is passed around,
> and initialized it to pointer to the toplevel type (except for VLAs,
> in that case just to pointer to the VLA element type).
> 
> Here is the updated patch.

LGTM.

Thanks,
Richard.

> 
> 2020-11-20  Jakub Jelinek  
> 
>   PR libstdc++/88101
> gcc/
>   * builtins.def (BUILT_IN_CLEAR_PADDING): New built-in function.
>   * gimplify.c (gimplify_call_expr): Rewrite single argument
>   BUILT_IN_CLEAR_PADDING into two-argument variant.
>   * gimple-fold.c (clear_padding_unit, clear_padding_buf_size): New
>   const variables.
>   (struct clear_padding_struct): New type.
>   (clear_padding_flush, clear_padding_add_padding,
>   clear_padding_emit_loop, clear_padding_type,
>   clear_padding_union, clear_padding_real_needs_padding_p,
>   clear_padding_type_may_have_padding_p,
>   gimple_fold_builtin_clear_padding): New functions.
>   (gimple_fold_builtin): Handle BUILT_IN_CLEAR_PADDING.
>   * doc/extend.texi (__builtin_clear_padding): Document.
> gcc/c-family/
>   * c-common.c (check_builtin_function_arguments): Handle
>   BUILT_IN_CLEAR_PADDING.
> gcc/testsuite/
>   * c-c++-common/builtin-clear-padding-1.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-1.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-2.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-3.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-4.c: New test.
>   * c-c++-common/torture/builtin-clear-padding-5.c: New test.
>   * g++.dg/torture/builtin-clear-padding-1.C: New test.
>   * g++.dg/torture/builtin-clear-padding-2.C: New test.
>   * gcc.dg/builtin-clear-padding-1.c: New test.
> 
> --- gcc/builtins.def.jj   2020-11-19 20:00:47.116518082 +0100
> +++ gcc/builtins.def  2020-11-20 10:51:44.715684363 +0100
> @@ -839,6 +839,7 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_CLEAR_C
>  /* [trans-mem]: Adjust BUILT_IN_TM_CALLOC if BUILT_IN_CALLOC is changed.  */
>  DEF_LIB_BUILTIN(BUILT_IN_CALLOC, "calloc", BT_FN_PTR_SIZE_SIZE, 
> ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_2_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN(BUILT_IN_CLASSIFY_TYPE, "classify_type", 
> BT_FN_INT_VAR, ATTR_LEAF_LIST)
> +DEF_GCC_BUILTIN(BUILT_IN_CLEAR_PADDING, "clear_padding", 
> BT_FN_VOID_VAR, ATTR_NOTHROW_NONNULL_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN(BUILT_IN_CLZ, "clz", BT_FN_INT_UINT, 
> ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN(BUILT_IN_CLZIMAX, "clzimax", BT_FN_INT_UINTMAX, 
> ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN(BUILT_IN_CLZL, "clzl", BT_FN_INT_ULONG, 
> ATTR_CONST_NOTHROW_LEAF_LIST)
> --- gcc/gimplify.c.jj 2020-11-20 08:43:52.262473979 +0100
> +++ gcc/gimplify.c2020-11-20 10:58:37.035125705 +0100
> @@ -3384,6 +3384,20 @@ gimplify_call_expr (tree *expr_p, gimple
>   cfun->calls_eh_return = true;
>   break;
>  
> +  case BUILT_IN_CLEAR_PADDING:
> + if (call_expr_nargs (*expr_p) == 1)
> +   {
> + /* Remember the original type of the argument in an internal
> +dummy second argument, as in GIMPLE pointer conversions are
> +useless.  */
> + p = CALL_EXPR_ARG (*expr_p, 0);
> + *e

RE: [PATCH] Deal with (pattern) SLP consumed stmts in hybrid discovery

2020-11-20 Thread Richard Biener

On Fri, 20 Nov 2020, Tamar Christina wrote:

> Hi Richi,
> 
> > -Original Message-
> > From: rguent...@ryzen.fritz.box  On Behalf Of
> > Richard Biener
> > Sent: Friday, November 20, 2020 9:54 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Tamar Christina 
> > Subject: [PATCH] Deal with (pattern) SLP consumed stmts in hybrid discovery
> > 
> > This makes hybrid SLP discovery deal with stmts indirectly consumed by SLP,
> > for example via patterns.  This means that all uses of a stmt end up in SLP
> > vectorized stmts.
> > 
> > This helps my prototype patches for PR97832 where I make SLP discovery re-
> > associate chains to make operands match.  This ends up building SLP
> > computation nodes without 1:1 representatives in the scalar IL and thus no
> > scalar lane defs in SLP_TREE_SCALAR_STMTS.  Nevertheless all of the original
> > scalar stmts are consumed so this represents another kind of SLP pattern for
> > the computation chain result.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > Tamar - can you check if this helps you avoiding all the relevancy push/pop
> > stuff as well as avoiding any pattern marking for SLP patterns at all?
> > 
> 
> Looks like it does allow me to avoid the relevancy and SLP_TYPE markings:
> 
> === vect_detect_hybrid_slp ===
> Processing hybrid candidate : slp_patt_48 = .COMPLEX_MUL (_pa_50, _pa_49);
> Marked SLP consumed stmt pure: slp_patt_48 = .COMPLEX_MUL (_pa_50, _pa_49);
> Processing hybrid candidate : slp_patt_51 = .COMPLEX_MUL (_pa_53, _pa_52);
> Marked SLP consumed stmt pure: slp_patt_51 = .COMPLEX_MUL (_pa_53, _pa_52);
> Processing hybrid candidate : _24 = _9 * _19;
> Marked SLP consumed stmt pure: _24 = _9 * _19;
> Processing hybrid candidate : _23 = _10 * _18;
> Marked SLP consumed stmt pure: _23 = _10 * _18;
> Processing hybrid candidate : _22 = _9 * _18;
> Marked SLP consumed stmt pure: _22 = _9 * _18;
> Processing hybrid candidate : _17 = _10 * _19;
> Marked SLP consumed stmt pure: _17 = _10 * _19;
> 
> And costing looks to be ignoring the unrelated statements
> 
> 0x50bd800 REALPART_EXPR <*_3> 1 times unaligned_load (misalign -1) costs 1 in body
> 0x50bd800 REALPART_EXPR <*_5> 1 times unaligned_load (misalign -1) costs 1 in body
> 0x50bd800 .COMPLEX_MUL (_pa_53, _pa_52) 1 times vector_stmt costs 1 in body
> 0x50bd800 REALPART_EXPR <*_5> 1 times unaligned_load (misalign -1) costs 1 in body
> 0x50bd800  0 times vec_perm costs 0 in body
> 0x50bd800 .COMPLEX_MUL (_pa_50, _pa_49) 1 times vector_stmt costs 1 in body
> 0x50bd800  1 times vec_perm costs 2 in body
> 0x50bd800 _25 1 times unaligned_store (misalign -1) costs 1 in body
> 
> But it doesn't allow me to avoid the pattern markings.
> 
> Without the pattern markings it will crash when analyzing the loads
> 
> note:   ==> examining statement: _26 = REALPART_EXPR <*_5>;
> note:   Vectorizing an unaligned access.
> note:   vect_model_load_cost: unaligned supported by hardware.
> note:   vect_model_load_cost: inside_cost = 1, prologue_cost = 0 .
> 
> as the statement it finds is not being used:

Hmm, I see.  I'll dig into this a bit when you send out the patch.

> cadd.c:22:6: internal compiler error: in vect_slp_analyze_node_operations_1, at tree-vect-slp.c:3591
>    22 | void f90 (TYPE complex a[restrict N], TYPE complex b[restrict N], TYPE complex c[restrict N])
>   |  ^~~
> 0x17200e1 vect_slp_analyze_node_operations_1
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3591
> 0x17209e8 vect_slp_analyze_node_operations
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3784
> 0x1720995 vect_slp_analyze_node_operations
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
> 0x1720995 vect_slp_analyze_node_operations
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
> 0x1720995 vect_slp_analyze_node_operations
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
> 0x1720995 vect_slp_analyze_node_operations
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3776
> 0x17212f2 vect_slp_analyze_operations(vec_info*)
> /gnu-work/src/gcc/gcc/tree-vect-slp.c:3971
> 0x16e2c6e vect_analyze_loop_2
> /gnu-work/src/gcc/gcc/tree-vect-loop.c:2419
> 0x16e4bcd vect_analyze_loop(loop*, vec_info_shared*)
> /gnu-work/src/gcc/gcc/tree-vect-loop.c:2956
> 0x173ba4d try_vectorize_loop_1
> /gnu-work/src/gcc/gcc/tree-vectorizer.c:1010
> 0x173c23a try_vectorize_loop
> /gnu-work/src/gcc/gcc/tree-vectorizer.c:1163
> 0x173c470 vectorize_loops()
> /gnu-work/src/gcc/gcc/tree-vectorizer.c:1244
> 0x1579275 execute
> /gnu-work/src/gcc/gcc/tree-ssa-loop.c:414
> 
> I will assume the patch's existence in tree when I send out my COMPLEX_ADD 
> patch today then.

OK, will push it then - I hoped it might help you!

Thanks,
Richard.

> Thanks!,
> Tamar
> 
> > 2020-11-20  Richard Biener  
> > 
> > * tree-vect-slp.c (maybe_push_to_hybrid_worklist): New function.
> > (vect_detect_hybrid_slp): Use it.  Perform a backward walk
> > over the IL.
> > ---
>

[PATCH] dump SLP_TREE_REPRESENTATIVE

2020-11-20 Thread Richard Biener

It always annoyed me to see those empty SLP nodes in dumpfiles:

t.c:16:3: note:   node 0x3a2a280 (max_nunits=1, refcnt=1)
t.c:16:3: note: { }
t.c:16:3: note: children 0x3a29db0 0x3a29e90

resulting from two-operator handling.  The following makes
sure to also dump the operation template or VEC_PERM_EXPR.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-20  Richard Biener  

* tree-vect-slp.c (vect_print_slp_tree): Also dump
SLP_TREE_REPRESENTATIVE.
---
 gcc/tree-vect-slp.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index d2f2407ac92..f378148516f 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1908,6 +1908,14 @@ vect_print_slp_tree (dump_flags_t dump_kind, 
dump_location_t loc,
  : ""), node,
   estimated_poly_value (node->max_nunits),
 SLP_TREE_REF_COUNT (node));
+  if (SLP_TREE_DEF_TYPE (node) == vect_internal_def)
+{
+  if (SLP_TREE_CODE (node) == VEC_PERM_EXPR)
+   dump_printf_loc (metadata, user_loc, "op: VEC_PERM_EXPR\n");
+  else
+   dump_printf_loc (metadata, user_loc, "op template: %G",
+SLP_TREE_REPRESENTATIVE (node)->stmt);
+}
   if (SLP_TREE_SCALAR_STMTS (node).exists ())
 FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
   dump_printf_loc (metadata, user_loc, "\tstmt %u %G", i, stmt_info->stmt);
-- 
2.26.2

Only copare sizes of automatic variables

2020-11-20 Thread Jan Hubicka

Hi,
one of the common remaining reasons for ICF to fail after loading in the
function body is a mismatched type of an automatic variable.  This is because
compatible_types_p resorts to checking TYPE_MAIN_VARIANTs for
equivalence, which prevents merging many TBAA-compatible cases.  (And thus
is also not reflected by the hash extended by alias sets of accesses.)

Since in GIMPLE
automatic variables are just blocks of memory, I think we should
check only their size.  All accesses are matched when comparing the actual
loads/stores.

I am not sure if we need to match types of other DECLs, but I decided to try
to be safe here: for PARM_DECL/RESULT_DECL we match them anyway to be sure
that functions are ABI compatible.  For CONST_DECL and readonly global
VAR_DECLs, they are matched when comparing their constructors.  So I think
we can keep the compare to be safe and perhaps play with it next stage1.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* ipa-icf-gimple.c (func_checker::compare_decl): Do not compare types
of local variables.
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 69bc9ab5b34..67bb2747981 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -153,8 +153,21 @@ func_checker::compare_decl (const_tree t1, const_tree t2)
   && DECL_BY_REFERENCE (t1) != DECL_BY_REFERENCE (t2))
 return return_false_with_msg ("DECL_BY_REFERENCE flags are different");
 
-  if (!compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
-return return_false ();
+  /* We do not really need to check types of variables, since they are just
+ blocks of memory and we verify types of the accesses to them.
+ However do compare types of other kinds of decls
+ (parm decls and result decl types may affect ABI convetions).  */
+  if (t != VAR_DECL)
+{
+  if (!compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
+   return return_false ();
+}
+  else
+{
+  if (!operand_equal_p (DECL_SIZE (t1), DECL_SIZE (t2),
+   OEP_MATCH_SIDE_EFFECTS))
+   return return_false_with_msg ("DECL_SIZEs are different");
+}
 
   bool existed_p;
   const_tree &slot = m_decl_map.get_or_insert (t1, &existed_p);

Re: Only copare sizes of automatic variables

2020-11-20 Thread Richard Biener

On Fri, 20 Nov 2020, Jan Hubicka wrote:

> Hi,
> one of the common remaining reasons for ICF to fail after loading in the
> function body is a mismatched type of an automatic variable.  This is because
> compatible_types_p resorts to checking TYPE_MAIN_VARIANTs for
> equivalence, which prevents merging many TBAA-compatible cases.  (And thus
> is also not reflected by the hash extended by alias sets of accesses.)
> 
> Since in GIMPLE
> automatic variables are just blocks of memory, I think we should
> check only their size.  All accesses are matched when comparing the actual
> loads/stores.
> 
> I am not sure if we need to match types of other DECLs, but I decided to try
> to be safe here: for PARM_DECL/RESULT_DECL we match them anyway to be sure
> that functions are ABI compatible.  For CONST_DECL and readonly global
> VAR_DECLs, they are matched when comparing their constructors.  So I think
> we can keep the compare to be safe and perhaps play with it next stage1.
> 
> Bootstrapped/regtested x86_64-linux, OK?

I suppose we eventually check the types of SSA names?  OK if so.

Richard.

> Honza
> 
>   * ipa-icf-gimple.c (func_checker::compare_decl): Do not compare types
>   of local variables.
> diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
> index 69bc9ab5b34..67bb2747981 100644
> --- a/gcc/ipa-icf-gimple.c
> +++ b/gcc/ipa-icf-gimple.c
> @@ -153,8 +153,21 @@ func_checker::compare_decl (const_tree t1, const_tree t2)
>&& DECL_BY_REFERENCE (t1) != DECL_BY_REFERENCE (t2))
>  return return_false_with_msg ("DECL_BY_REFERENCE flags are different");
>  
> -  if (!compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
> -return return_false ();
> +  /* We do not really need to check types of variables, since they are just
> + blocks of memory and we verify types of the accesses to them.
> + However do compare types of other kinds of decls
> + (parm decls and result decl types may affect ABI convetions).  */
> +  if (t != VAR_DECL)
> +{
> +  if (!compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
> + return return_false ();
> +}
> +  else
> +{
> +  if (!operand_equal_p (DECL_SIZE (t1), DECL_SIZE (t2),
> + OEP_MATCH_SIDE_EFFECTS))
> + return return_false_with_msg ("DECL_SIZEs are different");
> +}
>  
>bool existed_p;
>const_tree &slot = m_decl_map.get_or_insert (t1, &existed_p);
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

Re: Only copare sizes of automatic variables

2020-11-20 Thread Jan Hubicka

> On Fri, 20 Nov 2020, Jan Hubicka wrote:
> 
> > Hi,
> > one of the common remaining reasons for ICF to fail after loading in the
> > function body is a mismatched type of an automatic variable.  This is because
> > compatible_types_p resorts to checking TYPE_MAIN_VARIANTs for
> > equivalence, which prevents merging many TBAA-compatible cases.  (And thus
> > is also not reflected by the hash extended by alias sets of accesses.)
> > 
> > Since in GIMPLE
> > automatic variables are just blocks of memory, I think we should
> > check only their size.  All accesses are matched when comparing the actual
> > loads/stores.
> > 
> > I am not sure if we need to match types of other DECLs, but I decided to try
> > to be safe here: for PARM_DECL/RESULT_DECL we match them anyway to be sure
> > that functions are ABI compatible.  For CONST_DECL and readonly global
> > VAR_DECLs, they are matched when comparing their constructors.  So I think
> > we can keep the compare to be safe and perhaps play with it next stage1.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> I suppose we eventually check the types of SSA names?  OK if so.

Yes, SSA names are checked as part of operand_equal_p.  So VAR_DECLs of
register variables are safe too (as are SSA_NAMES with no associated
VAR_DECL)
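
The rule discussed in this thread can be pictured with a toy model. This is only a sketch under stated assumptions: `Decl`, `DeclKind` and `decls_mergeable` are hypothetical names, types are modelled as plain strings, and none of this is the actual ipa-icf-gimple.c code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for GCC's tree nodes; not the real data structures.
enum class DeclKind { Var, Parm, Result };

struct Decl
{
  DeclKind kind;
  std::string type_name;   // models TREE_TYPE
  std::uint64_t size_bits; // models DECL_SIZE
};

// Automatic variables are treated as blocks of memory, so only their sizes
// must agree; every other kind of decl still needs matching types.
inline bool
decls_mergeable (const Decl &a, const Decl &b)
{
  if (a.kind != b.kind)
    return false;
  if (a.kind != DeclKind::Var)
    return a.type_name == b.type_name;
  return a.size_bits == b.size_bits;
}
```

In this model two 32-bit locals of different types are mergeable, while two parameters of different types are not, which is the distinction the patch draws.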

Honza

[PATCH]middle-end vect: Have vectorizable_slp_permutation set type on invariants

2020-11-20 Thread Tamar Christina via Gcc-patches

Hi All,

This modifies vectorizable_slp_permutation to update the type of the children
of a perm node before trying to permute them.  This allows us to be able to
permute invariant nodes.
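
A toy model of the "update the type, then check it" order may help. It is a sketch only: `SlpNode`, `maybe_update_vectype` and `children_permutable` are hypothetical names, vector types are plain strings, and the behaviour assumed for the update helper (assign a type to an untyped invariant, otherwise require agreement) is an assumption rather than a copy of the vectorizer code.

```cpp
#include <string>
#include <vector>

// Toy SLP node; an empty vectype means "no vector type assigned yet".
struct SlpNode
{
  bool invariant;
  std::string vectype;
};

// Hypothetical stand-in for vect_maybe_update_slp_op_vectype: give an
// untyped invariant node the requested type, and report whether the node
// can be used with that type.
inline bool
maybe_update_vectype (SlpNode &node, const std::string &vectype)
{
  if (!node.invariant)
    return true;                     // internal nodes keep their own type
  if (!node.vectype.empty ())
    return node.vectype == vectype;  // already typed: must agree
  node.vectype = vectype;
  return true;
}

// Mirrors the shape of the patched check: update first, then compare.
inline bool
children_permutable (std::vector<SlpNode> &children,
		     const std::string &vectype)
{
  for (SlpNode &child : children)
    if (!maybe_update_vectype (child, vectype) || child.vectype != vectype)
      return false;
  return true;
}
```

Without the update step an untyped invariant child would fail the comparison even though it could simply adopt the permute's type.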

This will be covered by test from the SLP pattern matcher.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* tree-vect-slp.c (vectorizable_slp_permutation): Update types on nodes
when needed.

--- inline copy of patch -- 
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index d2f2407ac92f23724387e893a24c6661b514dafb..92af9cb698a70ae899f31da01415253782ada5d8 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -5114,7 +5114,8 @@ vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi,
   slp_tree child;
   unsigned i;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-if (!types_compatible_p (SLP_TREE_VECTYPE (child), vectype))
+if (!vect_maybe_update_slp_op_vectype (child, vectype)
+   || !types_compatible_p (SLP_TREE_VECTYPE (child), vectype))
   {
if (dump_enabled_p ())
  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,



Re: [PATCH]middle-end vect: Have vectorizable_slp_permutation set type on invariants

2020-11-20 Thread Richard Biener

On Fri, 20 Nov 2020, Tamar Christina wrote:

> Hi All,
> 
> This modifies vectorizable_slp_permutation to update the type of the children
> of a perm node before trying to permute them.  This allows us to be able to
> permute invariant nodes.
> 
> This will be covered by test from the SLP pattern matcher.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

Thanks,
Richard..

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-vect-slp.c (vectorizable_slp_permutation): Update types on nodes
>   when needed.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index d2f2407ac92f23724387e893a24c6661b514dafb..92af9cb698a70ae899f31da01415253782ada5d8 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -5114,7 +5114,8 @@ vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi,
>slp_tree child;
>unsigned i;
>FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> -if (!types_compatible_p (SLP_TREE_VECTYPE (child), vectype))
> +if (!vect_maybe_update_slp_op_vectype (child, vectype)
> + || !types_compatible_p (SLP_TREE_VECTYPE (child), vectype))
>{
>   if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

[committed] libstdc++: Remove <memory_resource> dependency from <regex> [PR 92546]

2020-11-20 Thread Jonathan Wakely via Gcc-patches

Unlike the other headers that declare alias templates in namespace pmr,
<regex> includes <memory_resource>. That was done because the
pmr::string::const_iterator typedef requires pmr::string to be complete,
which requires pmr::polymorphic_allocator to be complete.

By using __normal_iterator instead of the
const_iterator typedef we can avoid the completeness requirement.
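
The completeness point can be shown with a small stand-alone model. It is a sketch, not libstdc++ code: `Alloc`, `NormalIterator`, `BasicString` and `MatchResults` are hypothetical stand-ins for the real templates.

```cpp
#include <type_traits>

template<typename T> struct Alloc;   // only declared at this point

template<typename Ptr, typename Container>
struct NormalIterator { Ptr current; };

template<typename Char, typename A>
struct BasicString
{
  // Instantiating the class needs A to be complete.
  using allocator_value = typename A::value_type;
  using const_iterator = NormalIterator<const Char *, BasicString>;
};

template<typename It> struct MatchResults { };

// Naming a specialization does not instantiate it.
using PmrString = BasicString<char, Alloc<char>>;

// Fine while Alloc is still incomplete: PmrString is only used as a
// template argument.  Writing PmrString::const_iterator here instead
// would instantiate PmrString and fail.
using SMatch = MatchResults<NormalIterator<const char *, PmrString>>;

// Much later, e.g. in another header, the allocator is defined.
template<typename T> struct Alloc { using value_type = T; };

// Both spellings name the same type once everything is complete.
static_assert (std::is_same<SMatch,
			    MatchResults<PmrString::const_iterator>>::value,
	       "alias matches the nested typedef");
```

Spelling out the iterator type trades a little readability for not having to include the header that defines the allocator.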

This makes <regex> smaller, by not requiring <memory_resource> and its
<shared_mutex> dependency, which depends on <chrono>.  Backporting this
will also help with PR 97876, where <stop_token> ends up being needed by
<regex> via <memory_resource>.

libstdc++-v3/ChangeLog:

PR libstdc++/92546
* include/std/regex (pmr::smatch, pmr::wsmatch): Declare using
underlying __normal_iterator type, not nested typedef
basic_string::const_iterator.

No new test, because 28_regex/match_results/pmr_typedefs.cc already
checks these typedefs are correct.

Tested x86_64-linux. Committed to trunk.

commit 640ebeb336050887cb57417b7568279c588088f0
Author: Jonathan Wakely 
Date:   Fri Nov 20 11:30:33 2020

libstdc++: Remove <memory_resource> dependency from <regex> [PR 92546]

Unlike the other headers that declare alias templates in namespace pmr,
<regex> includes <memory_resource>. That was done because the
pmr::string::const_iterator typedef requires pmr::string to be complete,
which requires pmr::polymorphic_allocator to be complete.

By using __normal_iterator instead of the
const_iterator typedef we can avoid the completeness requirement.

This makes <regex> smaller, by not requiring <memory_resource> and its
<shared_mutex> dependency, which depends on <chrono>.  Backporting this
will also help with PR 97876, where <stop_token> ends up being needed by
<regex> via <memory_resource>.

libstdc++-v3/ChangeLog:

PR libstdc++/92546
* include/std/regex (pmr::smatch, pmr::wsmatch): Declare using
underlying __normal_iterator type, not nested typedef
basic_string::const_iterator.

diff --git a/libstdc++-v3/include/std/regex b/libstdc++-v3/include/std/regex
index 43ee1aee6162..783f5f131a67 100644
--- a/libstdc++-v3/include/std/regex
+++ b/libstdc++-v3/include/std/regex
@@ -64,21 +64,25 @@
 #include 
 
 #if __cplusplus >= 201703L && _GLIBCXX_USE_CXX11_ABI
-#include 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
-  namespace pmr {
+  namespace pmr
+  {
 template class polymorphic_allocator;
 template
   using match_results
= std::match_results<_BidirectionalIterator, polymorphic_allocator<
sub_match<_BidirectionalIterator>>>;
-using cmatch  = match_results;
-using smatch  = match_results;
+using cmatch = match_results;
+// Use __normal_iterator directly, because pmr::string::const_iterator
+// would require pmr::polymorphic_allocator to be complete.
+using smatch
+  = match_results<__gnu_cxx::__normal_iterator>;
 #ifdef _GLIBCXX_USE_WCHAR_T
 using wcmatch = match_results;
-using wsmatch = match_results;
+using wsmatch
+  = match_results<__gnu_cxx::__normal_iterator>;
 #endif
   } // namespace pmr
 _GLIBCXX_END_NAMESPACE_VERSION

Re: [PATCH] libstdc++: Ensure __gthread_self doesn't call undefined weak symbol [PR 95989]

2020-11-20 Thread Jonathan Wakely via Gcc-patches


On 19/11/20 21:42 +, Jonathan Wakely wrote:

On 12/11/20 17:34 +, Jonathan Wakely wrote:

On 11/11/20 19:08 +0100, Jakub Jelinek via Libstdc++ wrote:

On Wed, Nov 11, 2020 at 05:24:42PM +, Jonathan Wakely wrote:

--- a/libgcc/gthr-posix.h
+++ b/libgcc/gthr-posix.h
@@ -684,7 +684,14 @@ __gthread_equal (__gthread_t __t1, __gthread_t __t2)
static inline __gthread_t
__gthread_self (void)
{
+#if __GLIBC_PREREQ(2, 27)


What if it is a non-glibc system where __GLIBC_PREREQ macro isn't defined?
I think you'd get then
error: missing binary operator before token "("
So I think you want
#if defined __GLIBC__ && defined __GLIBC_PREREQ
#if __GLIBC_PREREQ(2, 27)
return pthread_self ();
#else
return __gthrw_(pthread_self) ();
#endif
#else
return __gthrw_(pthread_self) ();
#endif
or similar.
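
The nested-guard idea above can be reduced to a small portable sketch. The macro `SKETCH_PTHREAD_SELF_IN_LIBC` and the function `pthread_self_in_libc` are hypothetical names used only for illustration; they are not part of libgcc or libstdc++.

```cpp
#include <cstdio>   // any glibc header also defines the glibc version macros

// The outer test keeps the preprocessor from evaluating
// "__GLIBC_PREREQ (2, 27)" when that macro does not exist: the inner #if
// then sits in a skipped group and its expression is never parsed.
#if defined __GLIBC__ && defined __GLIBC_PREREQ
# if __GLIBC_PREREQ (2, 27)
#  define SKETCH_PTHREAD_SELF_IN_LIBC 1
# endif
#endif
#ifndef SKETCH_PTHREAD_SELF_IN_LIBC
# define SKETCH_PTHREAD_SELF_IN_LIBC 0
#endif

// True when building against glibc 2.27 or later.
inline bool
pthread_self_in_libc ()
{
  return SKETCH_PTHREAD_SELF_IN_LIBC != 0;
}
```

Collapsing the version test into a single 0/1 macro means later code needs only one `#if` and no duplicated fallback branch.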



Here's a fixed version of the patch.

I've moved the glibc-specific code in this_thread::get_id() into a new
macro defined in config/os/gnu-linux/os_defines.h (where we already
know we are dealing with glibc). That means we don't do the
__GLIBC_PREREQ check directly in <thread>; it's hidden away in a
target-specific header.

Tested powerpc64le-linux (glibc 2.17 and 2.32), sparc-solaris2.11 and
powerpc-aix.


I've committed this version which only fixes this_thread::get_id() in
libstdc++, and doesn't change __gthread_self in gthr-posix.h

Due to a recent change to replace other uses of __gthread_self with
calls to this_thread::get_id(), fixing it there fixes all uses in
libstdc++.


Here's the backport for gcc-10, where we still use __gthread_self in
two places in <stop_token>.

Tested x86_64-linux, committed to gcc-10 branch.



commit b11c74b0e357652a1f1f0d937455310a6389
Author: Jonathan Wakely 
Date:   Thu Nov 19 21:07:06 2020

libstdc++: Avoid calling undefined __gthread_self weak symbol [PR 95989]

Since glibc 2.27 the pthread_self symbol has been defined in libc rather
than libpthread. Because we only call pthread_self through a weak alias
it's possible for statically linked executables to end up without a
definition of pthread_self. This crashes when trying to call an
undefined weak symbol.

We can use the __GLIBC_PREREQ version check to detect the version of
glibc where pthread_self is no longer in libpthread, and call it
directly rather than through the weak reference.

It would be better to check for pthread_self in libc during configure
instead of hardcoding the __GLIBC_PREREQ check. That would be
complicated by the fact that prior to glibc 2.27 libc.a didn't have the
pthread_self symbol, but libc.so.6 did.  The configure checks would need
to try to link both statically and dynamically, and the result would
depend on whether the static libc.a happens to be installed during
configure (which could vary between different systems using the same
version of glibc). Doing it properly is left for a future date, as that
will be needed anyway after glibc moves all pthread symbols from
libpthread to libc. When that happens we should revisit the whole
approach of using weak symbols for pthread symbols.

For the purposes of std::this_thread::get_id() we call
pthread_self() directly when using glibc 2.27 or later. Otherwise, if
__gthread_active_p() is true then we know the libpthread symbol is
available so we call that. Otherwise, we are single-threaded and just
use ((__gthread_t)1) as the thread ID.

An undesirable consequence of this change is that code compiled prior to
the change might inline the old definition of this_thread::get_id()
which always returns (__gthread_t)1 in a program that isn't linked to
libpthread. Code compiled after the change will use pthread_self() and
so get a real TID. That could result in the main thread having different
thread::id values in different translation units. This seems acceptable,
as there are not expected to be many uses of thread::id in programs
that aren't linked to libpthread.

An earlier version of this patch also changed __gthread_self() to use
__GLIBC_PREREQ(2, 27) and only use the weak symbol for older glibc. That
might still make sense to do, but isn't needed by libstdc++ now.

libstdc++-v3/ChangeLog:

PR libstdc++/95989
* config/os/gnu-linux/os_defines.h (_GLIBCXX_NATIVE_THREAD_ID):
Define new macro to get reliable thread ID.
* include/std/stop_token (_Stop_state_t::_M_request_stop):
Use new macro if it's defined.
(_Stop_state_t::_M_remove_callback): Likewise.
* include/std/thread (this_thread::get_id): Likewise.
* testsuite/30_threads/jthread/95989.cc: New test.
* testsuite/30_threads/this_thread/95989.cc: New test.

(cherry picked from commit 08b4d325711d5c6f68ac29443aba3fd7aa173ac8)

diff --git a/libstdc++-v3/config/os/gnu-linux/os_defines.h b/libstdc++-v3/config/os/gnu-linux/os_defines.h
index f821486

[PATCH] libstdc++: Fix compilation error with clang-8 [PR 97876]

2020-11-20 Thread Jonathan Wakely via Gcc-patches

This fixes a compilation error with clang-8 and earlier. This change is
only on the gcc-10 branch, not master, because the <stop_token> header
is included indirectly in more places on the branch than on master.

PR libstdc++/97876
* include/std/stop_token (_Stop_state_t): Define default
constructor as user-provided not defaulted.

Tested x86_64-linux, committed to gcc-10 branch *only*.
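
The message does not say which diagnostic clang-8 issues, so the following is only a sketch of the language-level difference between a defaulted and a user-provided default constructor. `Defaulted` and `UserProvided` are hypothetical types, not the real _Stop_state_t.

```cpp
#include <type_traits>

struct Defaulted
{
  int value;
  Defaulted () = default;      // defaulted on its first declaration
};

struct UserProvided
{
  int value;
  UserProvided () noexcept { } // user-provided, with an explicit noexcept
};

// A defaulted constructor keeps the type trivially default constructible;
// a user-provided one never does, even with an empty body.
static_assert (std::is_trivially_default_constructible<Defaulted>::value,
	       "defaulted constructor is trivial here");
static_assert (!std::is_trivially_default_constructible<UserProvided>::value,
	       "user-provided constructor is not trivial");

// Both remain non-throwing, so callers relying on noexcept see no change.
static_assert (std::is_nothrow_default_constructible<Defaulted>::value, "");
static_assert (std::is_nothrow_default_constructible<UserProvided>::value, "");
```

With a user-provided constructor the compiler no longer has to deduce the implicit definition and exception specification itself.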

commit a186d72afd6cfb13efd4a0ec82049d79892334fd
Author: Jonathan Wakely 
Date:   Thu Nov 19 22:32:54 2020

libstdc++: Fix compilation error with clang-8 [PR 97876]

This fixes a compilation error with clang-8 and earlier. This change is
only on the gcc-10 branch, not master, because the <stop_token> header
is included indirectly in more places on the branch than on master.

PR libstdc++/97876
* include/std/stop_token (_Stop_state_t): Define default
constructor as user-provided not defaulted.

diff --git a/libstdc++-v3/include/std/stop_token b/libstdc++-v3/include/std/stop_token
index 76709dd59ebd..80f50ea83ca9 100644
--- a/libstdc++-v3/include/std/stop_token
+++ b/libstdc++-v3/include/std/stop_token
@@ -166,7 +166,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __gthread_t _M_requester;
 #endif
 
-  _Stop_state_t() = default;
+  _Stop_state_t() noexcept { }
 
   bool
   _M_stop_possible() noexcept

Re: [PATCH v2] Add if-chain to switch conversion pass.

2020-11-20 Thread Richard Biener via Gcc-patches

On Fri, Nov 20, 2020 at 9:57 AM Martin Liška  wrote:
>
> On 11/19/20 3:46 PM, Richard Biener wrote:
> > OK, so can you send an updated patch?

+  tree pos_one = build_int_cst (type, 1);
+  if (!left->m_has_forward_bb
+ && !right->m_has_forward_bb
+ && left->m_case_bb == right->m_case_bb)
+   {
+ tree next = int_const_binop (PLUS_EXPR, left->get_high (), pos_one);
+ if (tree_int_cst_equal (next, right->get_low ()))

can you please avoid tree arithmetic here and use wide_ints?

  if (wi::eq_p (wi::to_wide (right->get_low ()) - wi::to_wide (left->get_high ()),
		wi::one (TYPE_PRECISION (type))))

?
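
The adjacency test suggested here can be sketched outside GCC with fixed-width integers. `CaseRange` and `adjacent_p` are hypothetical names, and 64-bit unsigned arithmetic stands in for wide_int at the precision of the switch type.

```cpp
#include <cstdint>

// Hypothetical model of one switch case cluster with an inclusive range.
struct CaseRange
{
  std::int64_t low;
  std::int64_t high;
};

// Two sorted ranges are adjacent when right.low - left.high == 1.  The
// subtraction is done in unsigned arithmetic, which wraps like fixed
// precision wide_int arithmetic and avoids signed overflow; no temporary
// "high + 1" constant has to be built.
inline bool
adjacent_p (const CaseRange &left, const CaseRange &right)
{
  std::uint64_t diff = static_cast<std::uint64_t> (right.low)
		       - static_cast<std::uint64_t> (left.high);
  return diff == 1;
}
```

The sketch assumes, as the pass does for its sorted clusters, that `right` starts after `left` ends.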

+  info.record_phi_mapping (info.m_true_edge,
+  &info.m_true_edge_phi_mapping);
+  info.record_phi_mapping (info.m_false_edge,
+  &info.m_false_edge_phi_mapping);

you still have this edge mapping stuff, can't you recover the actual
PHI arguments from the PHIs during code generation?  I see you removed
the restriction for all-same values, good.

+unsigned int
+pass_if_to_switch::execute (function *fun)
+{
+  auto_vec all_candidates;
+  hash_map conditions_in_bbs;
+
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fun)
+find_conditions (bb, &conditions_in_bbs);
+

if we didn't find any suitable conditions we can early out

+  free_dominance_info (CDI_DOMINATORS);
+
+  if (!all_candidates.is_empty ())
+mark_virtual_operands_for_renaming (fun);

please do not free dominator info when you did nothing
(all_candidates.is_empty ()).

+ gcond *cond = chain->m_entries[0]->m_cond;
+ expanded_location loc = expand_location (gimple_location (cond));
+ if (dump_file)
+   {
+ fprintf (dump_file, "Condition chain (at %s:%d) with %d BBs "
+  "transformed into a switch statement.\n",
+  loc.file, loc.line,
+  chain->m_entries.length ());
+   }

if you use dump_enabled_p () and dump_printf_loc you can
use 'cond' as location itself.

Otherwise looks OK.

Thanks,
Richard.

> Sure.
>
> Martin

Re: [PATCH] Check calls before loop unrolling

2020-11-20 Thread David Edelsohn via Gcc-patches

On Fri, Nov 20, 2020 at 2:48 AM Richard Biener
 wrote:
>
> On Fri, Nov 20, 2020 at 12:58 AM Segher Boessenkool
>  wrote:
> >
> > On Thu, Nov 19, 2020 at 03:30:37PM -0700, Jeff Law wrote:
> > > > No, the vast majority of people will *not* (consciously) use them,
> > > > because the target defaults will set things to useful values.
> > > >
> > > > The compiler could use saner "generic" defaults perhaps, but those will
> > > > still not be satisfactory for anyone (except when they aren't generic in
> > > > fact but instead tuned for one arch ;-) ) -- unrolling is just too
> > > > important for performance.
> > > Then fix the heuristics, don't add new PARAMS :-)
> >
> > I just said that cannot work?
> >
> > > It didn't even occur to me until now that you may be pushing to have the
> > > ppc backend have different values for the PARAMS.  I would strongly
> > > discourage that.  It's been a huge headache in the s390 backend already.
> >
> > It also makes a huge performance difference.  That the generic parts
> > of GCC are only tuned for x86 (or not well tuned for anything?) is a
> > huge roadblock for us.
> >
> > I am not saying we should have six hundred different tunings.  But we
> > need a few (and we already *have* a few, not params but generic flags,
> > just like many other targets fwiw).
> >
> > We *do* have a few custom param settings already, just like aarch64,
> > ia64, and sh, actually.
> >
> > > >> In my mind fixing things so they work with no magic arguments is best.
> > > >> PARAMS are the worst solution.  A -f flag with no arguments is
> > > >> somewhere in between.  Others may clearly have different opinions here.
> > > > There is no big difference between params and flags here, IMO -- it has
> > > > to be a -f with a value as well, for good results.
> > > Which is a signal that we have a deeper problem.  -f with a value is no
> > > different than a param.
> >
> > Yes exactly.
> >
> > > > Since we have (almost) all such tunings in --param already, I'd say this
> > > > one belongs there as well?
> > > I'm not convinced at this point.
> >
> > Why not?
> >
> > We have way many params, yes.
>
> --params were introduced to avoid "magic numbers" in code and at the
> same time not overwhelm users with many -f options.  That they are
> runtime-controllable was probably done because we could and because
> it's nice for GCC developers.

GCC historically has not done a good job at loop unrolling.  And
tuning of loop unrolling is inherently architecture- and
microarchitecture-specific.  There is no "better heuristic".  There is
nothing inherent in the instruction stream to determine an optimal
unrolling that is correct for x86 and AArch64 and RISC-V and s390x and
Power.

Based on what academic literature or experience of other compilers
have you determined that this limitation can be addressed with "fix
the heuristics"?

The patch *IS* trying to fix the heuristics.  The heuristics require
additional, processor-specific information.  And a parameter is the
natural mechanism in GCC to provide a numerical value to adjust a
heuristic.

As Richard wrote, the GCC community chose to collect the "magic
numbers" in a centralized table with a consistent interface that can
be overridden by individual ports.  The ability to override the
parameters on the command line was for convenience and ease of
development.  It's not meant as a value that any end-user normally
will adjust.  But there is no reason to arbitrarily start a campaign
against parameters.

GCC supports a large number of targets, and that requires the ability
to adjust the optimization and transformation behavior of the compiler
to achieve the best performance on a wide variety of processors.  We
would appreciate it if you would not block a patch that improves the
code generation of GCC on Power (and other targets) because of an
aesthetic concern about too many parameters in GCC.  That ship has
sailed.

Thanks, David

>
> >  But the first step to counteract that
> > would be to deprecate and get rid of many existing ones, not to block
> > having new ones which can be useful (while many of the existing ones are
> > not).
>
> Not sure about this - sure, if heuristic can be simplified to use N < M
> (previous) "magic" numbers that's better.  But if "deprecating" just
> involves pasting the current --param default literally into the heuristics
> then no, please not.
>
> For this particular patch the question is if the heuristic is sound,
> not the particular magic number.  And I have no opinion about this
> (being this is the RTL unroller).
>
> Richard.
>
> >
> > Or, we could accept that it is not really a problem at all.  You seem to
> > have a strong opinion that it *is*, but I don't understand that; maybe
> > you can explain a bit more?
> >
> > Thanks,
> >
> >
> > Segher

Re: [PATCH] [PR target/97726] arm: [testsuite] fix some simd tests on armbe

2020-11-20 Thread Andrea Corallo via Gcc-patches

Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: 16 November 2020 16:11
>> To: Andrea Corallo via Gcc-patches 
>> Cc: nd ; Richard Earnshaw ;
>> Kyrylo Tkachov 
>> Subject: [PATCH] [PR target/97726] arm: [testsuite] fix some simd tests on
>> armbe
>> 
>> Andrea Corallo via Gcc-patches  writes:
>> 
>> > Hi all,
>> >
>> > I'd like to submit this patch to fix three testcases reported to be
>> > failing on arm big endian on PR target/97727.
>> >
>> > Okay for trunk?
>> >
>> > Thanks
>> >
>> >   Andrea
>> 
>> Oops, I got the PR number wrong; target/97726 is the correct one.
>> 
>> Attached the updated patch+changelog.
>> 
>> Sorry for the trouble.
>
> Ok.
> Thanks,
> Kyrill
>

Hi Kyrill,

installed into trunk as 86706296b7e.

Thanks

  Andrea

doc: Fixup a couple of formatting nits

2020-11-20 Thread Nathan Sidwell



I noticed a couple of places we used @code{program} instead of
@command{program}.

gcc/
* doc/invoke.texi: Replace a couple of @code with @command

pushing to trunk
--
Nathan Sidwell
diff --git c/gcc/doc/invoke.texi w/gcc/doc/invoke.texi
index 07232c6b33d..29ae36861ad 100644
--- c/gcc/doc/invoke.texi
+++ w/gcc/doc/invoke.texi
@@ -97,7 +97,7 @@ The usual way to run GCC is to run the executable called @command{gcc}, or
 When you compile C++ programs, you should invoke GCC as @command{g++} 
 instead.  @xref{Invoking G++,,Compiling C++ Programs}, 
 for information about the differences in behavior between @command{gcc} 
-and @code{g++} when compiling C++ programs.
+and @command{g++} when compiling C++ programs.
 
 @cindex grouping options
 @cindex options, grouping
@@ -14352,7 +14422,7 @@ Note that it is quite common that execution counts of some part of
 programs depends, for example, on length of temporary file names or
 memory space randomization (that may affect hash-table collision rate).
 Such non-reproducible part of programs may be annotated by
-@code{no_instrument_function} function attribute. @code{gcov-dump} with
+@code{no_instrument_function} function attribute. @command{gcov-dump} with
 @option{-l} can be used to dump gathered data and verify that they are
 indeed reproducible.

RE: [PATCH] [PR target/97727] aarch64: [testcase] fix bf16_vstN_lane_2.c for big endian targets

2020-11-20 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Andrea Corallo 
> Sent: 09 November 2020 18:24
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; nd ;
> christophe.l...@linaro.org
> Subject: [PATCH] [PR target/97727] aarch64: [testcase] fix
> bf16_vstN_lane_2.c for big endian targets
> 
> Hi all,
> 
> this simple patch is to fix PR target/97727.
> 
> Okay for trunk and gcc-10?
> 

Ok.
Thanks,
Kyrill

> Thanks!
> 
>   Andrea
> 
> 2020-11-09  Andrea Corallo  
> 
>   PR target/97727
>   * gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_2.c: Relax
>   regexps.
>

Modules doc

2020-11-20 Thread Nathan Sidwell

Here is an updated C++ modules documentation patch.  I'd be grateful for
review, especially for checking that I'm not using too much implementor-speak.


nathan
--
Nathan Sidwell
diff --git c/gcc/doc/cppopts.texi w/gcc/doc/cppopts.texi
index 7f1849d841f..e5ece92487b 100644
--- c/gcc/doc/cppopts.texi
+++ w/gcc/doc/cppopts.texi
@@ -139,6 +139,10 @@ this useless.
 
 This feature is used in automatic updating of makefiles.
 
+@item -Mno-modules
+@opindex Mno-modules
+Disable dependency generation for compiled module interfaces.
+
 @item -MP
 @opindex MP
 This option instructs CPP to add a phony target for each dependency
diff --git c/gcc/doc/invoke.texi w/gcc/doc/invoke.texi
index 02abac39de8..29ae36861ad 100644
--- c/gcc/doc/invoke.texi
+++ w/gcc/doc/invoke.texi
@@ -172,6 +172,7 @@ listing and explanation of the binary and decimal byte size prefixes.
 * Spec Files::  How to pass switches to sub-processes.
 * Environment Variables:: Env vars that affect GCC.
 * Precompiled Headers:: Compiling a header once, and using it many times.
+* C++ Modules::		Experimental C++20 module system.
 @end menu
 
 @c man begin OPTIONS
@@ -214,14 +215,21 @@ in the following sections.
 -faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
 -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n} @gol
 -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n} @gol
+-fmodule-header@r{[}=@var{kind}@r{]} -fmodule-only -fmodules-ts @gol
+-fmodule-implicit-inline @gol
+-fmodule-mapper=@var{specification} @gol
+-fmodule-version-ignore @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
 -fno-gnu-keywords @gol
 -fno-implicit-templates @gol
 -fno-implicit-inline-templates @gol
--fno-implement-inlines  -fms-extensions @gol
+-fno-implement-inlines  @gol
+-fno-module-lazy @gol
+-fms-extensions @gol
 -fnew-inheriting-ctors @gol
 -fnew-ttp-matching @gol
+-fno-module-lazy @gol
 -fno-nonansi-builtins  -fnothrow-opt  -fno-operator-names @gol
 -fno-optional-diags  -fpermissive @gol
 -fno-pretty-templates @gol
@@ -233,12 +241,14 @@ in the following sections.
 -fvisibility-inlines-hidden @gol
 -fvisibility-ms-compat @gol
 -fext-numeric-literals @gol
+-flang-info-include-translate@r{[}=@var{name}@r{]} @gol
 -Wabi-tag  -Wcatch-value  -Wcatch-value=@var{n} @gol
 -Wno-class-conversion  -Wclass-memaccess @gol
 -Wcomma-subscript  -Wconditionally-supported @gol
 -Wno-conversion-null  -Wctad-maybe-unsupported @gol
 -Wctor-dtor-privacy  -Wno-delete-incomplete @gol
--Wdelete-non-virtual-dtor  -Wdeprecated-copy  -Wdeprecated-copy-dtor @gol
+-Wdelete-non-virtual-dtor  -Wdeprecated-copy -Wdeprecated-copy-dtor @gol
+-Winvalid-imported-macros @gol
 -Wno-deprecated-enum-enum-conversion -Wno-deprecated-enum-float-conversion @gol
 -Weffc++  -Wno-exceptions -Wextra-semi  -Wno-inaccessible-base @gol
 -Wno-inherited-variadic-ctor  -Wno-init-list-lifetime @gol
@@ -599,7 +609,7 @@ Objective-C and Objective-C++ Dialects}.
 -fpreprocessed  -ftabstop=@var{width}  -ftrack-macro-expansion  @gol
 -fwide-exec-charset=@var{charset}  -fworking-directory @gol
 -H  -imacros @var{file}  -include @var{file} @gol
--M  -MD  -MF  -MG  -MM  -MMD  -MP  -MQ  -MT @gol
+-M  -MD  -MF  -MG  -MM  -MMD  -MP  -Mno-modules -MQ  -MT @gol
 -no-integrated-cpp  -P  -pthread  -remap @gol
 -traditional  -traditional-cpp  -trigraphs @gol
 -U@var{macro}  -undef  @gol
@@ -1571,7 +1581,7 @@ name suffix).  This option applies to all following input files until
 the next @option{-x} option.  Possible values for @var{language} are:
 @smallexample
 c  c-header  cpp-output
-c++  c++-header  c++-cpp-output
+c++  c++-header  c++-system-header c++-user-header c++-cpp-output
 objective-c  objective-c-header  objective-c-cpp-output
 objective-c++ objective-c++-header objective-c++-cpp-output
 assembler  assembler-with-cpp
@@ -3056,6 +3066,53 @@ To save space, do not emit out-of-line copies of inline functions
 controlled by @code{#pragma implementation}.  This causes linker
 errors if these functions are not inlined everywhere they are called.
 
+@item -fmodules-ts
+@itemx -fno-modules-ts
+@opindex fmodules-ts
+@opindex fno-modules-ts
+Enable support for C++ 20 modules.  The @option{-fno-modules-ts} is
+usually not needed, as that is the default.  Even though this is a
+C++20 feature, it is not currently implicitly enabled by selecting
+that standard version.
+
+@item -fmodule-header
+@itemx -fmodule-header=user
+@itemx -fmodule-header=system
+@opindex fmodule-header
+Compile as a header unit.
+
+@item -fmodule-implicit-inline
+@opindex fmodule-implicit-inline
+Member functions defined in their class definitions are not
+implicitly inline for modular code.  This is different to traditional
+C++ behaviour, for good reasons.  However, it may result in a
+difficulty during code porting.  This option will make such function
+definitions implicitly inline.  It does however generate an ABI
+incompatibility, so you must use it everywhere or nowhere.  (Such
+definitions outside of a name

Re: [PATCH] Check calls before loop unrolling

2020-11-20 Thread Jan Hubicka

> On Thu, Nov 19, 2020 at 03:30:37PM -0700, Jeff Law wrote:
> > > No, the vast majority of people will *not* (consciously) use them,
> > > because the target defaults will set things to useful values.
> > >
> > > The compiler could use saner "generic" defaults perhaps, but those will
> > > still not be satisfactory for anyone (except when they aren't generic in
> > > fact but instead tuned for one arch ;-) ) -- unrolling is just too
> > > important for performance.
> > Then fix the heuristics, don't add new PARAMS :-)
> 
> I just said that cannot work?
> 
> > It didn't even occur to me until now that you may be pushing to have the
> > ppc backend have different values for the PARAMS.  I would strongly
> > discourage that.  It's been a huge headache in the s390 backend already.
> 
> It also makes a huge performance difference.  That the generic parts
> of GCC are only tuned for x86 (or not well tuned for anything?) is a
> huge roadblock for us.

As you know I have spent quite some time on inliner heuristics, but even after
all these years I have no clear idea how the requirements differ between x86-64
and ppc, arm and s390.  Clearly, compared to x86_64, prologues may get more
expensive on ppc/arm because of more registers (so we should inline less
into cold code) and function calls are more expensive (so we should inline
more into hot code).  We do have PRs for that in the testsuite, most of
which I have looked through.

The problem is that each of us has a different methodology: different
benchmarks to look at and different opinions on what is good for O2 and
O3.  From a long-term maintenance POV I am worried about changing a lot of
--param defaults in different backends, simply because the meaning of
those values keeps changing (as early opts improve; we get better at
tracking optimizations during IPA passes; and our focus shifts from C
with sane inlines to basic C++ to heavily templatized C++ with many broken
inline hints to heavy C++ with LTO).

For this reason I tend to prefer not to tweak in target-specific ways
unless there is very clear evidence to do so, just because I think I will
not be able to maintain code quality testing in the future.

It would be very interesting to set up testing that would let us compare
the basic arches side by side with different defaults.  Our LNT testing does
a good job for x86-64, but we have basically zero publicly available
coverage on other targets, and it is very hard to get inliner-relevant
benchmarks (where SPEC is not the best choice) done in a comparable way on
multiple arches.
Honza

Re: [PATCH] gcov: Add __gcov_info_to_gdca()

2020-11-20 Thread Martin Liška


On 11/20/20 11:11 AM, Sebastian Huber wrote:

On 20/11/2020 10:49, Martin Liška wrote:


On 11/20/20 10:25 AM, Sebastian Huber wrote:

On 20/11/2020 09:37, Martin Liška wrote:


On 11/17/20 10:57 AM, Sebastian Huber wrote:

This is a proposal to get the gcda data for a gcda info in a free-standing
environment.  It is intended to be used with the -fprofile-info-section option.
A crude test program which doesn't use a linker script is:


Hello.

I'm not quite sure how this setup is going to work. Can you please explain
that to me?

I was thinking about your needs and I can imagine various techniques for
generating the gcda file format:

1) an embedded system can override fopen, fwrite and fseek with functions
that perform the writes remotely

Yes, this is one option, however, the inhibit_libc disables quite a lot of 
libgcov functionality if Newlib is used for example.


I see. Btw do you have available Newlib in the embedded environment? If so, 
what I/O functionality is provided?

Yes, I use Newlib with the RTEMS real-time operating system. Newlib provides 
the standard C library I/O functions (fopen, etc.). However, having Newlib 
available doesn't mean that every application uses it. Applications are 
statically linked with the operating system and Newlib. They only use what is 
required. Some applications cannot use the standard C library I/O since they 
use a lot of infrastructure and memory. You can do a lot of things with just a 
couple of KiBs available.


I see.





2) - use -fprofile-info-section
   - run an app on an embedded system and do a memory dump to a terminal/console
   - take the memory dump to a host system (with IO), run 
__gcov_init_from_memory_dump (...)
 and then do a normal __gcov_dump


I am not sure if a plain memory dump really simplifies things. You have to get 
the filename separately since it is only referenced in gcov_info and not 
included in the structure:

struct gcov_info
{
[...]
   const char *filename;        /* output file name */
[...]
#ifndef IN_GCOV_TOOL
   const struct gcov_fn_info *const *functions; /* pointer to pointers
   to function information  */
[...]
#endif /* !IN_GCOV_TOOL */
};


I see!



Also the gcov_fn_info is not embedded in the gcov_info structure. If you do a 
plain memory dump, then you dump also pointers and how do you deal with these 
pointers on the host? You would need some extra information to describe the 
memory dump. So, why not use the gcda format for this? It is also more compact 
since zero value counters are skipped. Serial lines are slow, so less data to 
transfer is good.

/* Convert the gcov information to a gcda data stream.  The first callback is
    called exactly once with the filename associated with the gcov information.
    The filename may be NULL.  Afterwards, the second callback is subsequently
    called with chunks (the begin and length of the chunk are passed as the
    first two arguments) of the gcda data stream.  The fourth parameter is a
    user-provided argument passed as the last argument to the callback
    functions.  */

extern void __gcov_info_to_gcda (const struct gcov_info *gi_ptr,
              void (*filename) (const char *name, void *arg),
              void (*dump) (const void *begin, unsigned size, void *arg),

              void *arg);

If __gcov_info_to_gcda() is correctly implemented, then this should give you 
directly gcda files if you use something like this:

#include <gcov.h>
#include <stdio.h>

extern const struct gcov_info *__gcov_info_start[];
extern const struct gcov_info *__gcov_info_end[];

static void
filename (const char *f, void *arg)
{
   FILE **file = arg;
   *file = fopen(f, "wb");
}

static void
dump (const void *d, unsigned n, void *arg)
{
   FILE **file = arg;
   fwrite(d, n, 1, *file);
}

static void
dump_gcov_info (void)
{
   const struct gcov_info **info = __gcov_info_start;
   const struct gcov_info **end = __gcov_info_end;

   /* Obfuscate variable to prevent compiler optimizations.  */
   __asm__ ("" : "+r" (end));

   while (info != end)
   {
 FILE *file = NULL;
 __gcov_info_to_gcda (*info, filename, dump, &file);
 fclose(file);
 ++info;
   }
}

int
main()
{
   dump_gcov_info();
   return 0;
}

The callback functions give the user full control over how the data of the gcda 
file is encoded for the transfer to a host. No gcov internals are exposed.



All right. Btw. how will you implement these 2 callbacks on the embedded target?


One option is to convert the gcov info to YAML:

gcov-info:

- file: filename1

   data: <... base64 encoded data from __gcov_info_to_gcda ... >

- file: filename2

   data: ...

Then send the data to the host via a serial line. On the host read the data, 
parse the YAML, and create the gcda files. The __gcov_info_to_gcda() needs 
about 408 bytes of ARM Thumb-2 code and no data. You need a polled character 
output function, the linker set iteration and two callbacks. So, you c
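As a rough illustration of the YAML approach described above, the dump callback could base64-encode the gcda chunks as they arrive. The sketch below is only one way to do that: the names b64_state, b64_dump, b64_finish, mem_sink and mem_sink_put are hypothetical and not part of libgcov, and the in-memory sink merely stands in for a polled character output function. State is kept between calls because chunk boundaries need not fall on 3-byte groups.

```c
#include <stddef.h>

/* Hypothetical in-memory sink; on a target this would be replaced by a
   polled character output routine.  */
struct mem_sink
{
  char buf[64];
  unsigned len;
};

static void
mem_sink_put (char c, void *arg)
{
  struct mem_sink *s = arg;
  if (s->len < sizeof s->buf)
    s->buf[s->len++] = c;
}

/* Hypothetical encoder state carried across dump callback invocations.  */
struct b64_state
{
  unsigned char pending[3];          /* input bytes not yet encoded */
  unsigned count;                    /* number of valid bytes in pending */
  void (*put) (char c, void *arg);   /* character output function */
  void *put_arg;
};

static const char b64_tab[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode the 1 to 3 pending bytes as four output characters.  */
static void
b64_emit (struct b64_state *s)
{
  unsigned v = (unsigned) s->pending[0] << 16;
  if (s->count > 1)
    v |= (unsigned) s->pending[1] << 8;
  if (s->count > 2)
    v |= s->pending[2];
  s->put (b64_tab[(v >> 18) & 63], s->put_arg);
  s->put (b64_tab[(v >> 12) & 63], s->put_arg);
  s->put (s->count > 1 ? b64_tab[(v >> 6) & 63] : '=', s->put_arg);
  s->put (s->count > 2 ? b64_tab[v & 63] : '=', s->put_arg);
  s->count = 0;
}

/* Has the shape of the proposed dump callback: (begin, size, arg).  */
static void
b64_dump (const void *begin, unsigned size, void *arg)
{
  struct b64_state *s = arg;
  const unsigned char *p = begin;
  while (size-- > 0)
    {
      s->pending[s->count++] = *p++;
      if (s->count == 3)
        b64_emit (s);
    }
}

/* Flush a trailing partial group once the last chunk has been seen.  */
static void
b64_finish (struct b64_state *s)
{
  if (s->count > 0)
    b64_emit (s);
}
```

In this sketch b64_finish would be called after __gcov_info_to_gcda() returns for one gcov info, so that each YAML data entry is padded independently.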

Re: [PATCH 01/31] PR target/58901: reload: Handle SUBREG of MEM with a mode-dependent address

2020-11-20 Thread Maciej W. Rozycki

On Fri, 20 Nov 2020, Eric Botcazou wrote:

> > gcc/
> > PR target/58901
> > * reload.c (reload_inner_reg_of_subreg): Also request reloading
> > for pseudo registers associated with mode dependent memory
> > references.
> > (push_reload): Handle pseudo registers.
> 
> The handling of this family of reloads is supposed to be done by the block of 
> code just above though, i.e. at line 1023.  Can't we add the test based on 
> mode_dependent_address_p to this block, e.g. after:
> 
> || (REG_P (SUBREG_REG (in))
> && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER
> && !REG_CAN_CHANGE_MODE_P (REGNO (SUBREG_REG (in)),
>GET_MODE (SUBREG_REG 
> (in)), inmode
> 
> instead?

 Thank you for your input, I'll have a look.  Coming from Matt this is the 
only change of the series I have just merged without looking into it too 
much, so as not to spend too much time with side issues (there were too 
many already).

 It'll take me a couple of days to push the new version through regression 
testing, and I'll post it once that is complete along with any other 
updates someone may request.

  Maciej

Re: [PATCH] [PR target/97727] aarch64: [testcase] fix bf16_vstN_lane_2.c for big endian targets

2020-11-20 Thread Andrea Corallo via Gcc-patches

Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: 09 November 2020 18:24
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov ; Richard Earnshaw
>> ; nd ;
>> christophe.l...@linaro.org
>> Subject: [PATCH] [PR target/97727] aarch64: [testcase] fix
>> bf16_vstN_lane_2.c for big endian targets
>> 
>> Hi all,
>> 
>> this simple patch is to fix PR target/97727.
>> 
>> Okay for trunk and gcc-10?
>> 
>
> Ok.
> Thanks,
> Kyrill

Into master and gcc-10 respectively as f671b3d79fe and 48b21baa8c7.

Thanks!

  Andrea

[PATCH] Power10: Add missing IEEE 128-bit XSCMP* built-in mappings.

2020-11-20 Thread Michael Meissner via Gcc-patches

Power10: Add missing IEEE 128-bit XSCMP* built-in mappings.

This patch is a simplification of earlier patches to fix the built-in functions
that introduced new power10 IEEE 128-bit instructions.  Some of the built-in
functions were already handled, but the scalar_cmp_exp_qp_gt, etc. functions
were not handled.  This shows up in the float128-cmp2-runnable.c test when long
double uses the IEEE 128-bit representation.

I had done the previous patches fairly quickly, forgetting about the switch
inside of rs6000_expand_builtin in rs6000-call.c that switches between KF and
TF built-in functions without having to add overloaded function names.  This
patch uses that simpler method.

The previous patches were at:

Date: Thu, 24 Sep 2020 16:42:59 -0400
Subject: [PATCH 7/9] PowerPC: Update IEEE 128-bit built-in functions to work if long double is IEEE 128-bit.
Message-ID: <20200924204259.gg31...@ibm-toto.the-meissners.org>

Date: Thu, 22 Oct 2020 18:03:46 -0400
Subject: PowerPC: Map IEEE 128-bit long double built-in functions
Message-ID: <2020100346.ga8...@ibm-toto.the-meissners.org>

I have built two sets of bootstrap compilers on a little endian power9 server
system running Linux.  One compiler used the default IBM 128-bit long
double support and the other used IEEE 128-bit for the long double.  This patch
fixes the failure in the float128-cmp2-runnable.c test, and it adds no other
regressions.  Can I check this patch into the master branch?


2020-11-18  Michael Meissner  

* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Add missing
XSCMP* cases for IEEE 128-bit long double.
---
 gcc/config/rs6000/rs6000-call.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 8294e22fb85..1fdb39f15c0 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12803,6 +12803,22 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
   case CODE_FOR_xsiexpqp_kf:   icode = CODE_FOR_xsiexpqp_tf;   break;
   case CODE_FOR_xsiexpqpf_kf:  icode = CODE_FOR_xsiexpqpf_tf;  break;
   case CODE_FOR_xststdcqp_kf:  icode = CODE_FOR_xststdcqp_tf;  break;
+
+  case CODE_FOR_xscmpexpqp_eq_kf:
+   icode = CODE_FOR_xscmpexpqp_eq_tf;
+   break;
+
+  case CODE_FOR_xscmpexpqp_lt_kf:
+   icode = CODE_FOR_xscmpexpqp_lt_tf;
+   break;
+
+  case CODE_FOR_xscmpexpqp_gt_kf:
+   icode = CODE_FOR_xscmpexpqp_gt_tf;
+   break;
+
+  case CODE_FOR_xscmpexpqp_unordered_kf:
+   icode = CODE_FOR_xscmpexpqp_unordered_tf;
+   break;
   }
 
   if (TARGET_DEBUG_BUILTIN)
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

re: FAIL: gcc.dg/pr97515.c

2020-11-20 Thread Andrew MacLeod via Gcc-patches


On 11/19/20 10:56 PM, sunil.k.pandey wrote:

On Linux/x86_64,

d0d8b5d83614d8f0d0e40c0520d4f40ffa01f8d9 is the first bad commit
commit d0d8b5d83614d8f0d0e40c0520d4f40ffa01f8d9
Author: Andrew MacLeod 
Date:   Thu Nov 19 17:41:30 2020 -0500

 Process only valid shift ranges.

caused

FAIL: gcc.dg/pr97515.c scan-tree-dump-times evrp "goto" 1

with GCC configured with

../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5185/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97515.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97515.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97515.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97515.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Huh, that slipped by on my regression run...  obscured by all the 
various "tree-prof" unresolved errors that always seem to be happening.


Anyway.

So the problem here appears to be that when we had
  [-1,-1] >> VARYING
this used to overflow in the cross product code and just return VARYING.
Now the shift range is being masked, giving
 [-1,-1] >> [0,31]

so in the end, the rshift cross product code decides the result is [-1, 
-1]  instead of VARYING.
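To make the arithmetic concrete, here is a small stand-alone sketch of a cross product over the range bounds for an arithmetic right shift. It is not the ranger implementation; irange32, asr32 and rshift_range are hypothetical names, and the sketch assumes 32-bit operands with shift counts already limited to 0..31.

```c
#include <stdint.h>

/* Hypothetical closed integer range [lo, hi].  */
struct irange32
{
  int32_t lo, hi;
};

/* Arithmetic shift right written so that it does not rely on the
   implementation-defined behavior of >> on negative values.  */
static int32_t
asr32 (int32_t x, unsigned n)
{
  return x < 0 ? ~(~x >> n) : x >> n;
}

/* Evaluate the four corner combinations of operand and shift bounds and
   take their minimum and maximum.  This is sufficient here because the
   shift is monotonic in each argument when the other one is fixed.  */
static struct irange32
rshift_range (struct irange32 op, unsigned sh_lo, unsigned sh_hi)
{
  int32_t c[4] = { asr32 (op.lo, sh_lo), asr32 (op.lo, sh_hi),
                   asr32 (op.hi, sh_lo), asr32 (op.hi, sh_hi) };
  struct irange32 r = { c[0], c[0] };
  for (int i = 1; i < 4; i++)
    {
      if (c[i] < r.lo)
        r.lo = c[i];
      if (c[i] > r.hi)
        r.hi = c[i];
    }
  return r;
}
```

With op = [-1,-1] and shift bounds 0 and 31, every corner is -1, which is why the result collapses to [-1,-1] once the shift range is masked.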


This ends up changing the calculations, and we realize sooner that certain 
other parts of the program are unreachable, and the undefined value 
can't be used in constant propagation, so we don't fold away a couple of 
PHIs.

..

I added a comment describing why this is happening and adjusted the 
testcase for now to check in CCP that everything folded away.


Pushed an adjustment to the testcase to check, for now, in CCP2 that 
everything folded away.


Andrew






commit 65854626304d50cf348af53de1c29ccec06d33c6
Author: Andrew MacLeod 
Date:   Fri Nov 20 10:37:26 2020 -0500

re: FAIL: gcc.dg/pr97515.c

Adjust testcase to check in CCP not EVRP.

gcc/testsuite/
* gcc.dg/pr97515.c: Check in ccp2, not evrp.

diff --git a/gcc/testsuite/gcc.dg/pr97515.c b/gcc/testsuite/gcc.dg/pr97515.c
index 84f145a261f..b4f2481cb03 100644
--- a/gcc/testsuite/gcc.dg/pr97515.c
+++ b/gcc/testsuite/gcc.dg/pr97515.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-ccp2" } */
 
 int
 e7 (int gg)
@@ -20,6 +20,8 @@ e7 (int gg)
   return xe;
 }
 
-/* EVRP should be able to reduce this to a single goto.  */
+/* EVRP should be able to reduce this to a single goto when we can
+ * revisit statements to try folding again based on changed inputs.
+ * Until then, make sure its gone by ccp2.  */
  
-/* { dg-final { scan-tree-dump-times "goto" 1 "evrp" } } */
+/* { dg-final { scan-tree-dump-times "goto" 1 "ccp2" } } */

Re: [PATCH] gcov: Add __gcov_info_to_gdca()

2020-11-20 Thread Sebastian Huber


On 20/11/2020 16:25, Martin Liška wrote:

Apart from these 2 hooks, I bet you will also need gcov_position and 
gcov_seek functions, as can be seen in the patch I sent.

For what do I need them?



I prefer the way with the 2 extra hooks.
Can you please prepare a patch where the newly added functions 
__gcov_info_to_gcda and __gcov_fn_info_to_gcda
will be used in libgcov (with the hooks equal to fopen and fwrite)? 


I am not really sure what I should do. Do you mean that write_one_data() 
should be rewritten to use __gcov_info_to_gcda() with hooks that use 
gcov_write_unsigned()?


The write_one_data() also has a const struct gcov_summary *prg_p 
pointer. What should an external user provide for this pointer? For 
example &gi_ptr->summary?


The write_one_data() has this code

  if (fn_buffer && fn_buffer->fn_ix == f_ix)
    {
  /* Buffered data from another program.  */
  buffered = 1;
  gfi_ptr = &fn_buffer->info;
  length = GCOV_TAG_FUNCTION_LENGTH;
    }

which uses a global variable

/* buffer for the fn_data from another program.  */
static struct gcov_fn_buffer *fn_buffer;

For this handling we would need a new hook to do this:

  if (buffered)
    fn_buffer = free_fn_data (gi_ptr, fn_buffer, GCOV_COUNTERS);

I don't know for what we need seek and position hooks.

--
embedded brains GmbH
Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
Phone: +49-89-18 94 741 - 16
Fax:   +49-89-18 94 741 - 08
PGP: Public key available on request.

embedded brains GmbH
Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier: 
https://embedded-brains.de/datenschutzerklaerung/

PDP endian bitfields

2020-11-20 Thread Jakub Jelinek via Gcc-patches

On Mon, Nov 16, 2020 at 01:50:20PM -0500, Paul Koning wrote:
> > On Nov 16, 2020, at 6:57 AM, Jakub Jelinek via Gcc-patches 
> >  wrote:
> > Working virtually out of Baker Island - AoE timezone.
> > 
> > The following patch implements __builtin_clear_padding builtin that clears
> > the padding bits in object representation (but preserves value
> > representation).  Inside of unions it clears only those padding bits that
> > are padding for all the union members (so that it never alters value
> > representation).
> > 
> > It handles trailing padding, padding in the middle of structs including
> > bitfields (PDP11 unhandled, I've never figured out how those bitfields
> > work), etc.
> 
> That reminds me of a similar comment on a commit a while ago.  I'd like to
> take care of this, but I'm not sure what questions need to be answered to
> do so.  Can you point me in the right direction?

There are now many of them.
grep -C3 '\(BYTES\|WORDS\)_BIG_ENDIAN [!=]= \(BYTES\|WORDS\)_BIG_ENDIAN' gcc/{,*/}*.{c,h,cc}
will show various cases (ignore the ones mentioning FLOAT_*).

Some are just optimizations, so they can be ignored; others are in features not
really supported on pdp11 anyway (e.g. asan); but e.g. the
__builtin_clear_padding I've just checked in today, or the
__builtin_bit_cast that is pending review, need extra work for PDP11.
It would also be good if you could figure out what to do with the
optimizations, e.g. sccvn can now handle propagation of constants through
memory, including bitfields, on big and little endian but not on PDP endian.
The same goes for e.g. store merging.

In the __builtin_clear_padding case, basically all that is needed is find
out which bits in the target memory order are padding bits and which are
occupied by bitfields.  The code has the FIELD_DECL and needs to set certain
bits in the target memory image to 0 for bits in the bitfield and keep other
bits set if they are parts of other bitfields or padding.
For big and little endian, the code uses int_byte_position (field)
to find the starting byte,
tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) % BITS_PER_UNIT
to determine how many (perhaps padding) bits are before that bitfield
in the first byte and TYPE_PRECISION (TREE_TYPE (field)) to determine
the bitsize of the bitfield.
I have no idea what needs to be done for PDP11 endian; the testsuite already
includes testcases that could cover it, or new ones similar to the existing
ones can be added if there are other special cases that need to be checked
(such as e.g. char : N bitfields if they are handled differently etc.).
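For the little-endian case, the three quantities named above (starting byte, bit offset within that byte, and bitfield precision) are enough to clear the bitfield's bits in a padding mask. The following is a host-side sketch only: clear_bitfield_bits_le is a hypothetical name, it assumes 8-bit bytes and little-endian bit numbering within the memory image, and it says nothing about how PDP endian would need to differ.

```c
#include <limits.h>
#include <stddef.h>

/* MASK starts with all bits set (everything assumed to be padding).
   Clear the PRECISION bits that belong to a bitfield whose first bit is
   BIT_OFF bits into the byte at BYTE_POS.  Bits of other bitfields and
   real padding are left untouched.  */
static void
clear_bitfield_bits_le (unsigned char *mask, size_t byte_pos,
                        unsigned bit_off, unsigned precision)
{
  size_t bit = byte_pos * CHAR_BIT + bit_off;
  for (unsigned i = 0; i < precision; i++, bit++)
    mask[bit / CHAR_BIT] &= (unsigned char) ~(1u << (bit % CHAR_BIT));
}
```

For example, a 7-bit field starting 3 bits into byte 0 occupies the top five bits of byte 0 and the low two bits of byte 1; whatever remains set in the mask afterwards is either padding or belongs to another field.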

Jakub

Nested declare target support

2020-11-20 Thread Kwok Cheung Yeung


Hello


New OpenMP 5.0 features that won't be available in GCC 9, are planned for GCC 10
or later versions as time permits:


...

- nested declare target support


You said in an email two years ago that nested declare target was not supported 
yet. I do not see any patches that claim to implement this since then, but when 
I ran a quick test with a trunk build:


#pragma omp declare target
  #pragma omp declare target
int foo() { return 1; }
  #pragma omp end declare target
  int bar() { return 2; }
#pragma omp end declare target

This compiles and appears to do the right thing:

__attribute__((omp declare target, omp declare target block))
foo ()
...

__attribute__((omp declare target, omp declare target block))
bar ()
...

Looking at the C parser:

static void
c_parser_omp_declare_target (c_parser *parser)
{
  ...
  else
{
  c_parser_skip_to_pragma_eol (parser);
  current_omp_declare_target_attribute++;
  return;
}

static void
c_parser_omp_end_declare_target (c_parser *parser)
{
  ...
current_omp_declare_target_attribute--;
}

It looks like this was written to handle nesting to begin with (since at least 
2013) by making current_omp_declare_target_attribute (which effectively tracks 
the nesting level) an integer. Is there anything that is currently missing for 
nested declare target support?
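The counter-based scheme described above can be modelled in isolation as follows. This is a simplified stand-alone sketch, not the parser code: the function names are hypothetical, and the rejection of an unmatched end directive is an assumption added for illustration.

```c
#include <stdbool.h>

/* Models current_omp_declare_target_attribute: the nesting depth of
   open "declare target" regions.  */
static int declare_target_depth;

/* "#pragma omp declare target" without clauses opens a region.  */
static void
begin_declare_target (void)
{
  declare_target_depth++;
}

/* "#pragma omp end declare target" closes the innermost region.
   Returns false for an unmatched end directive (assumed behavior).  */
static bool
end_declare_target (void)
{
  if (declare_target_depth == 0)
    return false;
  declare_target_depth--;
  return true;
}

/* A declaration gets the attribute while any region is open.  */
static bool
in_declare_target (void)
{
  return declare_target_depth > 0;
}
```

Because only the depth is tracked, a declaration following the inner end directive (like bar in the test above) is still inside the outer region, which matches the observed attributes.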


Thanks

Kwok

Re: Modules doc

2020-11-20 Thread Marek Polacek via Gcc-patches

On Fri, Nov 20, 2020 at 10:19:55AM -0500, Nathan Sidwell wrote:
> Here is an update c++ modules documentation patch.  I'd be grateful for
> review.  Especially checking I'm not using too much implementor-speak
> 
> nathan
> -- 
> Nathan Sidwell

> diff --git c/gcc/doc/cppopts.texi w/gcc/doc/cppopts.texi
> index 7f1849d841f..e5ece92487b 100644
> --- c/gcc/doc/cppopts.texi
> +++ w/gcc/doc/cppopts.texi
> @@ -139,6 +139,10 @@ this useless.
>  
>  This feature is used in automatic updating of makefiles.
>  
> +@item -Mno-modules
> +@opindex Mno-modules
> +Disable dependency generation for compiled module interfaces.
> +
>  @item -MP
>  @opindex MP
>  This option instructs CPP to add a phony target for each dependency
> diff --git c/gcc/doc/invoke.texi w/gcc/doc/invoke.texi
> index 02abac39de8..29ae36861ad 100644
> --- c/gcc/doc/invoke.texi
> +++ w/gcc/doc/invoke.texi
> @@ -172,6 +172,7 @@ listing and explanation of the binary and decimal byte 
> size prefixes.
>  * Spec Files::  How to pass switches to sub-processes.
>  * Environment Variables:: Env vars that affect GCC.
>  * Precompiled Headers:: Compiling a header once, and using it many times.
> +* C++ Modules::  Experimental C++20 module system.
>  @end menu
>  
>  @c man begin OPTIONS
> @@ -214,14 +215,21 @@ in the following sections.
>  -faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
>  -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n} @gol
>  -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n} @gol
> +-fmodule-header@r{[}=@var{kind}@r{]} -fmodule-only -fmodules-ts @gol
> +-fmodule-implicit-inline @gol
> +-fmodule-mapper=@var{specification} @gol
> +-fmodule-version-ignore @gol
>  -fno-elide-constructors @gol
>  -fno-enforce-eh-specs @gol
>  -fno-gnu-keywords @gol
>  -fno-implicit-templates @gol
>  -fno-implicit-inline-templates @gol
> --fno-implement-inlines  -fms-extensions @gol
> +-fno-implement-inlines  @gol
> +-fno-module-lazy @gol
> +-fms-extensions @gol
>  -fnew-inheriting-ctors @gol
>  -fnew-ttp-matching @gol
> +-fno-module-lazy @gol
>  -fno-nonansi-builtins  -fnothrow-opt  -fno-operator-names @gol
>  -fno-optional-diags  -fpermissive @gol
>  -fno-pretty-templates @gol
> @@ -233,12 +241,14 @@ in the following sections.
>  -fvisibility-inlines-hidden @gol
>  -fvisibility-ms-compat @gol
>  -fext-numeric-literals @gol
> +-flang-info-include-translate@r{[}=@var{name}@r{]} @gol
>  -Wabi-tag  -Wcatch-value  -Wcatch-value=@var{n} @gol
>  -Wno-class-conversion  -Wclass-memaccess @gol
>  -Wcomma-subscript  -Wconditionally-supported @gol
>  -Wno-conversion-null  -Wctad-maybe-unsupported @gol
>  -Wctor-dtor-privacy  -Wno-delete-incomplete @gol
> --Wdelete-non-virtual-dtor  -Wdeprecated-copy  -Wdeprecated-copy-dtor @gol
> +-Wdelete-non-virtual-dtor  -Wdeprecated-copy -Wdeprecated-copy-dtor @gol
> +-Winvalid-imported-macros @gol
>  -Wno-deprecated-enum-enum-conversion -Wno-deprecated-enum-float-conversion 
> @gol
>  -Weffc++  -Wno-exceptions -Wextra-semi  -Wno-inaccessible-base @gol
>  -Wno-inherited-variadic-ctor  -Wno-init-list-lifetime @gol
> @@ -599,7 +609,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fpreprocessed  -ftabstop=@var{width}  -ftrack-macro-expansion  @gol
>  -fwide-exec-charset=@var{charset}  -fworking-directory @gol
>  -H  -imacros @var{file}  -include @var{file} @gol
> --M  -MD  -MF  -MG  -MM  -MMD  -MP  -MQ  -MT @gol
> +-M  -MD  -MF  -MG  -MM  -MMD  -MP  -Mno-modules -MQ  -MT @gol
>  -no-integrated-cpp  -P  -pthread  -remap @gol
>  -traditional  -traditional-cpp  -trigraphs @gol
>  -U@var{macro}  -undef  @gol
> @@ -1571,7 +1581,7 @@ name suffix).  This option applies to all following 
> input files until
>  the next @option{-x} option.  Possible values for @var{language} are:
>  @smallexample
>  c  c-header  cpp-output
> -c++  c++-header  c++-cpp-output
> +c++  c++-header  c++-system-header c++-user-header c++-cpp-output
>  objective-c  objective-c-header  objective-c-cpp-output
>  objective-c++ objective-c++-header objective-c++-cpp-output
>  assembler  assembler-with-cpp
> @@ -3056,6 +3066,53 @@ To save space, do not emit out-of-line copies of 
> inline functions
>  controlled by @code{#pragma implementation}.  This causes linker
>  errors if these functions are not inlined everywhere they are called.
>  
> +@item -fmodules-ts
> +@itemx -fno-modules-ts
> +@opindex fmodules-ts
> +@opindex fno-modules-ts
> +Enable support for C++ 20 modules.  The @option{-fno-modules-ts} is

We should be consistent wrt "C++ 20" and "C++20", so let's go with the
latter?

> +usually not needed, as that is the default.  Even though this is a
> +C++20 feature, it is not currently implicitly enabled by selecting
> +that standard version.
> +
> +@item -fmodule-header
> +@itemx -fmodule-header=user
> +@itemx -fmodule-header=system
> +@opindex fmodule-header
> +Compile as a header unit.

Not sure if everyone knows what a header unit is.

> +
> +@item -fmodule-implicit-inline
> +@opindex fmodule-i

Re: Modules doc

2020-11-20 Thread Nathan Sidwell

thanks for taking a look, I hope this is better -- I add a forward 
reference from -fmodules-ts option description, so as to not have to 
explain C++ terms of art just there :)


nathan
--
Nathan Sidwell
diff --git c/gcc/doc/cppopts.texi w/gcc/doc/cppopts.texi
index 7f1849d841f..e5ece92487b 100644
--- c/gcc/doc/cppopts.texi
+++ w/gcc/doc/cppopts.texi
@@ -139,6 +139,10 @@ this useless.
 
 This feature is used in automatic updating of makefiles.
 
+@item -Mno-modules
+@opindex Mno-modules
+Disable dependency generation for compiled module interfaces.
+
 @item -MP
 @opindex MP
 This option instructs CPP to add a phony target for each dependency
diff --git c/gcc/doc/invoke.texi w/gcc/doc/invoke.texi
index 02abac39de8..1857e16e475 100644
--- c/gcc/doc/invoke.texi
+++ w/gcc/doc/invoke.texi
@@ -172,6 +172,7 @@ listing and explanation of the binary and decimal byte size prefixes.
 * Spec Files::  How to pass switches to sub-processes.
 * Environment Variables:: Env vars that affect GCC.
 * Precompiled Headers:: Compiling a header once, and using it many times.
+* C++ Modules::		Experimental C++20 module system.
 @end menu
 
 @c man begin OPTIONS
@@ -214,14 +215,21 @@ in the following sections.
 -faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
 -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n} @gol
 -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n} @gol
+-fmodule-header@r{[}=@var{kind}@r{]} -fmodule-only -fmodules-ts @gol
+-fmodule-implicit-inline @gol
+-fmodule-mapper=@var{specification} @gol
+-fmodule-version-ignore @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
 -fno-gnu-keywords @gol
 -fno-implicit-templates @gol
 -fno-implicit-inline-templates @gol
--fno-implement-inlines  -fms-extensions @gol
+-fno-implement-inlines  @gol
+-fno-module-lazy @gol
+-fms-extensions @gol
 -fnew-inheriting-ctors @gol
 -fnew-ttp-matching @gol
+-fno-module-lazy @gol
 -fno-nonansi-builtins  -fnothrow-opt  -fno-operator-names @gol
 -fno-optional-diags  -fpermissive @gol
 -fno-pretty-templates @gol
@@ -233,12 +241,14 @@ in the following sections.
 -fvisibility-inlines-hidden @gol
 -fvisibility-ms-compat @gol
 -fext-numeric-literals @gol
+-flang-info-include-translate@r{[}=@var{name}@r{]} @gol
 -Wabi-tag  -Wcatch-value  -Wcatch-value=@var{n} @gol
 -Wno-class-conversion  -Wclass-memaccess @gol
 -Wcomma-subscript  -Wconditionally-supported @gol
 -Wno-conversion-null  -Wctad-maybe-unsupported @gol
 -Wctor-dtor-privacy  -Wno-delete-incomplete @gol
--Wdelete-non-virtual-dtor  -Wdeprecated-copy  -Wdeprecated-copy-dtor @gol
+-Wdelete-non-virtual-dtor  -Wdeprecated-copy -Wdeprecated-copy-dtor @gol
+-Winvalid-imported-macros @gol
 -Wno-deprecated-enum-enum-conversion -Wno-deprecated-enum-float-conversion @gol
 -Weffc++  -Wno-exceptions -Wextra-semi  -Wno-inaccessible-base @gol
 -Wno-inherited-variadic-ctor  -Wno-init-list-lifetime @gol
@@ -599,7 +609,7 @@ Objective-C and Objective-C++ Dialects}.
 -fpreprocessed  -ftabstop=@var{width}  -ftrack-macro-expansion  @gol
 -fwide-exec-charset=@var{charset}  -fworking-directory @gol
 -H  -imacros @var{file}  -include @var{file} @gol
--M  -MD  -MF  -MG  -MM  -MMD  -MP  -MQ  -MT @gol
+-M  -MD  -MF  -MG  -MM  -MMD  -MP  -Mno-modules -MQ  -MT @gol
 -no-integrated-cpp  -P  -pthread  -remap @gol
 -traditional  -traditional-cpp  -trigraphs @gol
 -U@var{macro}  -undef  @gol
@@ -1571,7 +1581,7 @@ name suffix).  This option applies to all following input files until
 the next @option{-x} option.  Possible values for @var{language} are:
 @smallexample
 c  c-header  cpp-output
-c++  c++-header  c++-cpp-output
+c++  c++-header  c++-system-header c++-user-header c++-cpp-output
 objective-c  objective-c-header  objective-c-cpp-output
 objective-c++ objective-c++-header objective-c++-cpp-output
 assembler  assembler-with-cpp
@@ -3056,6 +3066,52 @@ To save space, do not emit out-of-line copies of inline functions
 controlled by @code{#pragma implementation}.  This causes linker
 errors if these functions are not inlined everywhere they are called.
 
+@item -fmodules-ts
+@itemx -fno-modules-ts
+@opindex fmodules-ts
+@opindex fno-modules-ts
+Enable support for C++20 modules (@xref{C++ Modules}).  The
+@option{-fno-modules-ts} is usually not needed, as that is the
+default.  Even though this is a C++20 feature, it is not currently
+implicitly enabled by selecting that standard version.
+
+@item -fmodule-header
+@itemx -fmodule-header=user
+@itemx -fmodule-header=system
+@opindex fmodule-header
+Compile a header file to create an importable header unit.
+
+@item -fmodule-implicit-inline
+@opindex fmodule-implicit-inline
+Member functions defined in their class definitions are not implicitly
+inline for modular code.  This is different to traditional C++
+behavior, for good reasons.  However, it may result in a difficulty
+during code porting.  This option will make such function definitions
+implicitly inline.  It does however generate an ABI incompatibili

Re: Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-20 Thread Matthew Malcomson via Gcc-patches


Updates after latest review.
(testing underway)

---
There are three main features to this change:

1) Check pointer tags match address tags.

When sanitizing for hwasan we now put HWASAN_CHECK internal functions before
memory accesses in the `asan` pass.  This checks that the tag in the pointer
being used matches the tag stored in shadow memory for the memory region being
used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts far fewer builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way, usually in the runtime).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
The downside is that the naming may be a little confusing; the
upside is that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.
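The tag comparison described under feature 1 can be modelled on a host as below. This is a toy model rather than the code the pass emits: toy_tag_check is a hypothetical name, and the sketch assumes an 8-bit tag in the top byte of a 64-bit pointer, 16-byte shadow granules, and a shadow array indexed directly from address zero.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_TAG_SHIFT 56      /* assumed: tag lives in the top byte */
#define TOY_GRANULE_SHIFT 4   /* assumed: one shadow byte per 16 bytes */

/* Return true if every granule touched by the access
   [addr, addr + size) carries the same tag as the pointer.  */
static bool
toy_tag_check (const uint8_t *shadow, uint64_t tagged_ptr, uint64_t size)
{
  if (size == 0)
    return true;

  uint8_t tag = (uint8_t) (tagged_ptr >> TOY_TAG_SHIFT);
  uint64_t addr = tagged_ptr & ((UINT64_C (1) << TOY_TAG_SHIFT) - 1);
  uint64_t first = addr >> TOY_GRANULE_SHIFT;
  uint64_t last = (addr + size - 1) >> TOY_GRANULE_SHIFT;

  for (uint64_t g = first; g <= last; g++)
    if (shadow[g] != tag)
      return false;
  return true;
}
```

In this model, tagging a block amounts to writing a new tag into its shadow granules and returning a pointer with the same tag in the top byte, after which accesses through pointers carrying an older tag fail the check.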

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(asan_instrument): Branch based on whether using HWASAN or ASAN.
(pass_asan::gate): Return true if sanitizing HWASAN.
(pass_asan_O0::gate): Return true if sanitizing HWASAN.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
* asan.h (hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-fold.c (gimple_build): New overload for building function
calls without arguments.
(gimple_build_round_up): New.
* gimple-fold.h (gimple_build): New decl.
(gimple_build): New inline function.
(gimple_build_round_up): New decl.
(gimple_build_round_up): New inline function.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_SET_TAG): New.
* internal-fn.def (HWASAN_ALLOCA_UNPOISON): New.
(HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_SET_TAG): New.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_

Re: Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-20 Thread Richard Sandiford via Gcc-patches

Matthew Malcomson  writes:
> @@ -7877,6 +7903,26 @@ gimple_build_vector (gimple_seq *seq, location_t loc,
>return builder->build ();
>  }
>  
> +/* Emit gimple statements into &stmts that take a value given in `old_size`
> +   and generate a value guaranteed to be rounded upwards to `align`.
> +
> +   Return the tree node representing this size, it is of TREE_TYPE `type`.  */

Nit, but: the usual way of referring to parameter names is to use caps
(OLD_SIZE, ALIGN, TYPE) rather than backticks.  I don't think it's
necessary to change the hwasan-specific code to follow that style,
since what you have is self-consistent and readable as-is (although
changing it would be fine too if you prefer).  But since the surrounding
code consistently follows the caps style, I think it would be better to
use it here too.

OK with that change, thanks.

> +
> +tree
> +gimple_build_round_up (gimple_seq *seq, location_t loc, tree type,
> +tree old_size, unsigned HOST_WIDE_INT align)
> +{
> +  unsigned HOST_WIDE_INT tg_mask = align - 1;
> +  /* tree new_size = (old_size + tg_mask) & ~tg_mask;  */
> +  gcc_assert (INTEGRAL_TYPE_P (type));
> +  tree tree_mask = build_int_cst (type, tg_mask);
> +  tree oversize = gimple_build (seq, loc, PLUS_EXPR, type, old_size,
> + tree_mask);
> +
> +  tree mask = build_int_cst (type, -align);
> +  return gimple_build (seq, loc, BIT_AND_EXPR, type, oversize, mask);
> +}
> +
>  /* Return true if the result of assignment STMT is known to be non-negative.
> If the return value is based on the assumption that signed overflow is
> undefined, set *STRICT_OVERFLOW_P to true; otherwise, don't change

Richard
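
For readers following the arithmetic in the quoted helper, here is a
minimal standalone sketch in plain C.  It is not the GCC code: it uses
uint64_t in place of a tree of TYPE, the name round_up is only
illustrative, and it assumes ALIGN is a nonzero power of two (the sketch
asserts this; the quoted patch does not).

```c
#include <assert.h>
#include <stdint.h>

/* Round OLD_SIZE up to the next multiple of ALIGN.
   ALIGN must be a nonzero power of two (an assumption of this sketch).  */
uint64_t
round_up (uint64_t old_size, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  uint64_t tg_mask = align - 1;
  /* For unsigned values, -align has the same bits as ~(align - 1).  */
  uint64_t mask = -align;
  return (old_size + tg_mask) & mask;
}
```

With a 16-byte alignment, sizes 1 through 16 all round to 16 and size 17
rounds to 32, which matches the "(old_size + tg_mask) & ~tg_mask" comment
in the patch.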

Re: [PATCH] Check calls before loop unrolling

2020-11-20 Thread Segher Boessenkool

Hi!

On Fri, Nov 20, 2020 at 04:22:47PM +0100, Jan Hubicka wrote:
> As you know I spend quite some time on inliner heuristics but even after
> the years I have no clear idea how the requirements differ from x86-64
> to ppc, arm and s390.  Clearly compared to x86_64 prologues may get more
> expensive on ppc/arm because of more registers (so we should inline less
> to cold code) and function calls are more expensive (so we should inline
> more to hot code). We do have PR for that in testsuite where most of
> them I looked through.

I made -fshrink-wrap-separate to make prologues less expensive for stuff
that is only used on the cold paths.  This matters a lot -- and much
more could be done there, but that requires changing the generated code,
not just reordering it, so it is harder to do.

Prologues (and epilogues) are only expensive if they are only needed for
cold code, in a hot function.

> Problem is that each of us has different methodology - different
> benchmarks to look at

This is a good thing often as well, it increases our total coverage.
But if not everyone sees all results that also hurts :-/

> and different opinions on what is good for O2 and
> O3.

Yeah.  The documentation for -O3 merely says "Optimize yet more.", but
that is no guidance at all: why would a user ever use -O2 then?

I always understood it as "-O2 is always faster than -O1, but -O3 is not
always faster than -O2".  Aka "-O2 is always a good choice, and -O3 is
an even better choice for *some* code, but that needs testing per case".

In at least that understanding, and also to battle inflation in general,
we probably should move some things from -O3 to -O2.

> From long term maintenance POV I am worried about changing a lot of
> --param defaults in different backends

Me too.  But changing a few key ones is just too important for
performance :-/

> simply because the meaning of
> those values keeps changing (as early opts improve; we get better on
> tracking optimizations during IPA passes; and our focus shift from C
> with sane inlines to basic C++ to heavy templatized C++ with many broken
> inline hints to heavy C++ with lto).

I don't like if targets start to differ too much (in what generic passes
effectively do), no matter what.  It's just not maintainable.

> For this reason I tend to prefer to not tweak in target specific ways
> unless there is very clear evidence to do so just because I think I will
> not be able to maintain code quality testing in future.

Yes, completely agreed.  But that exception is important :-)

> It would be very interesting to set up testing that could let us compare
> basic arches side by side with different defaults. Our LNT testing does
> good job for x86-64 but we have basically zero coverage publicly
> available on other targets and it is very hard to get inliner relevant
> benchmarks (where SPEC is not the best choice) done in comparable way on
> multiple arches.

We cannot help with that on the cfarm, unless we get dedicated hardware
for such benchmarking (and I am not holding my breath for that, getting
good coverage at all is hard enough).  So you probably need to get such
support for every arch separately, elsewhere :-/

Segher

Re: Document --with-build-config=bootstrap-asan option.

2020-11-20 Thread Matthew Malcomson via Gcc-patches


On 13/01/2020 10:40, Matthew Malcomson wrote:

On 11/01/2020 07:19, Gerald Pfeifer wrote:

On Thu, 12 Dec 2019, Matthew Malcomson wrote:

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* doc/install.texi: Document bootstrap-asan configuration option.


I see this introduces a new table.


+Some examples of build configurations designed for developers of GCC are:


@samp{bootstrap-time}, @samp{bootstrap-debug-ckovw} and others appear
to fall into the same camp, essentially expected to be used by maintainers
only.

Would it make sense to add your new option to the existing table, or
perhaps see which other options from the existing table to move into
your new one?  Thoughts?


Sounds good to me.





The patch is okay modulo the question above.



Hi,

Apologies for bringing up something from this far back, but I've just 
noticed I never actually committed this patch.


Did the above line mean approval once the option was moved into the 
existing table?  Or was it just explaining that the position was the 
only problem you saw?


In other words: is the patch I proposed below Ok for trunk?

Thanks,
Matthew



Thanks,
Gerald



Patch with above suggestion.

#

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 80b47812fe66a8ef50edf3aad9708ab3409ba7dc..0705759c69f64c6d06e91f7ae83bb8c1ad210f34 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2668,6 +2668,10 @@ Arranges for the run time of each program started by the GCC driver,
 built in any stage, to be logged to @file{time.log}, in the top level of
 the build tree.
 
+@item @samp{bootstrap-asan}
+Compiles GCC itself using Address Sanitization in order to catch invalid memory
+accesses within the GCC code.
+
 @end table
 
 @section Building a cross compiler

[PATCH] dump type attributes in dump_function_to_file

2020-11-20 Thread Martin Sebor via Gcc-patches


dump_function_to_file prints DECL_ATTRIBUTES but not TYPE_ATTRIBUTES
when both can be important and helpful for debugging, especially with
attributes that are added implicitly (such as attribute access and
the proposed internal attribute *dealloc).  The function also prints
function arguments (and their types) but not its return type, again,
leaving out a useful detail.  The attached tweak adds both to
the dump.
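
The shape of the change is easier to see outside the tree machinery.  The
following is only a rough sketch in plain C, not the tree-cfg.c code: the
struct attr list, the format_attrs name and the string buffer are all
made up for illustration, and attribute arguments are left out.  It walks
an array holding the declaration's list and the type's list and emits one
"__attribute__((...))" line for each non-empty list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a TREE_CHAIN of attributes.  */
struct attr
{
  const char *name;
  const struct attr *next;
};

/* Append S to the NUL-terminated string in BUF (capacity LEN),
   truncating if it does not fit.  */
static void
append (char *buf, size_t len, const char *s)
{
  strncat (buf, s, len - strlen (buf) - 1);
}

/* Write one "__attribute__((a, b))" line per non-empty list into BUF.  */
void
format_attrs (char *buf, size_t len, const struct attr *decl_attrs,
	      const struct attr *type_attrs)
{
  const struct attr *lists[] = { decl_attrs, type_attrs };

  assert (len > 0);
  buf[0] = '\0';
  for (int i = 0; i != 2; ++i)
    {
      if (!lists[i])
	continue;

      append (buf, len, "__attribute__((");
      int first = 1;
      for (const struct attr *a = lists[i]; a; first = 0, a = a->next)
	{
	  if (!first)
	    append (buf, len, ", ");
	  append (buf, len, a->name);
	}
      append (buf, len, "))\n");
    }
}
```

An empty list is skipped entirely, so a function with only type
attributes still gets exactly one line.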

Martin
gcc/ChangeLog:

	* gcc/tree-cfg.c (dump_function_to_file): Print type attributes
	and return type.

gcc/testsuite/ChangeLog:
	* gcc.dg/attr-access-4.c: New test.

diff --git a/gcc/testsuite/gcc.dg/attr-access-4.c b/gcc/testsuite/gcc.dg/attr-access-4.c
new file mode 100644
index 000..e78b3602ade
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/attr-access-4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile }
+   { dg-options "-fdump-tree-gimple" } */
+
+__attribute__ ((aligned (32)))
+__attribute__ ((access (write_only, 2, 1)))
+void f (int n, void *p)
+{
+  __builtin_memset (p, 0, n);
+}
+
+/* Verify the DECL_ATTRIBUTE "aligned" is mentioned:
+   { dg-final { scan-tree-dump "__attribute__\\(\\(aligned" "gimple" } }
+   and the TYPE_ATTRIBUTE "access" is also mentioned:
+   { dg-final { scan-tree-dump "__attribute__\\(\\(access" "gimple" } }
+   and the function signature including its return type is mentioned:
+   { dg-final { scan-tree-dump "void f *\\(int n, void *\\* *p\\)" "gimple" } } */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5139f111fec..138f8ef17e0 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -7966,14 +7966,19 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
 		  && decl_is_tm_clone (fndecl));
   struct function *fun = DECL_STRUCT_FUNCTION (fndecl);
 
-  if (DECL_ATTRIBUTES (fndecl) != NULL_TREE)
+  tree fntype = TREE_TYPE (fndecl);
+  tree attrs[] = { DECL_ATTRIBUTES (fndecl), TYPE_ATTRIBUTES (fntype) };
+
+  for (int i = 0; i != 2; ++i)
 {
+  if (!attrs[i])
+	continue;
+
   fprintf (file, "__attribute__((");
 
   bool first = true;
   tree chain;
-  for (chain = DECL_ATTRIBUTES (fndecl); chain;
-	   first = false, chain = TREE_CHAIN (chain))
+  for (chain = attrs[i]; chain; first = false, chain = TREE_CHAIN (chain))
 	{
 	  if (!first)
 	fprintf (file, ", ");
@@ -8026,7 +8031,11 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
 	}
 }
   else
-fprintf (file, "%s %s(", function_name (fun), tmclone ? "[tm-clone] " : "");
+{
+  print_generic_expr (file, TREE_TYPE (fntype), dump_flags);
+  fprintf (file, " %s %s(", function_name (fun),
+	   tmclone ? "[tm-clone] " : "");
+}
 
   arg = DECL_ARGUMENTS (fndecl);
   while (arg)

Re: [PATCH] Power10: Add missing IEEE 128-bit XSCMP* built-in mappings.

2020-11-20 Thread Segher Boessenkool

On Fri, Nov 20, 2020 at 10:42:19AM -0500, Michael Meissner wrote:
> Power10: Add missing IEEE 128-bit XSCMP* built-in mappings.

Okay for trunk (and needed backports after waiting for possible
fallout).  Thanks!


Segher


> 2020-11-18  Michael Meissner  
> 
>   * config/rs6000/rs6000-call.c (rs6000_expand_builtin): Add missing
>   XSCMP* cases for IEEE 128-bit long double.

Re: [Patch 0/X] HWASAN v4

2020-11-20 Thread Matthew Malcomson via Gcc-patches


On 13/11/2020 17:22, Martin Liška wrote:

On 11/13/20 5:57 PM, Matthew Malcomson wrote:

Hi there,

Thanks for the heads-up.
As it turns out the most recent `libhwasan` crashes when displaying an 
address on the stack in Linux.


Hello.

What bad luck.



I'm currently working on getting it fixed here:
https://reviews.llvm.org/D91344#2393371
If this hwasan patch series gets approved and if that patch goes in 
would it be feasible to bump the libsanitizer merge to whatever 
version that would be?


If not (maybe because stage1 would be finished?) then could/would we 
end up using the LOCAL_PATCHES approach?


For now, I would prefer doing cherry picks. Hopefully, we'll end up with
just a couple of patches.


That makes sense, there's just one patch I need: 83ac1820.

As far as I can tell from the history, the process is simply to apply 
the patch in GCC, commit it with a ChangeLog etc as if it were a GCC 
patch, and then add the hash into LOCAL_PATCHES as a separate commit.


Is that right?


Given that it looks like the hwasan patch series is nearing going in 
now, I'd like to make sure I know how the library is getting added.


Is the plan something like the below?
1) You add the libhwasan update (i.e. this patch you posted).
2) I add the cherry-pick from LLVM compiler-rt (once it's approved)
and a separate commit updating LOCAL_PATCHES.
3) I add the hwasan patch series.

Or would it make more sense for me to apply your patch below with 
`--author` using your details (so it goes in the ChangeLog that way)?


Thanks!
Matthew




Thanks,
Martin



Thanks,
Matthew
-- 


*From:* Martin Liška 
*Sent:* 13 November 2020 16:33
*To:* Matthew Malcomson ; 
gcc-patches@gcc.gnu.org 
*Cc:* ja...@redhat.com ; Richard Earnshaw 
; k...@google.com ; 
do...@redhat.com ; jos...@codesourcery.com 


*Subject:* Re: [Patch 0/X] HWASAN v4
On 10/16/20 11:03 AM, Martin Liška wrote:

Hello.

I've just merged libsanitizer and there's the corresponding part that
includes libhwasan.

Martin


Hey.

I've just made the last merge from upstream; there's a corresponding hwasan
part.


Martin

Re: Update: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-11-20 Thread Matthew Malcomson via Gcc-patches



Hi there,

I was just doing some double-checks and noticed I'd placed the
documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
documented in the `Register Classes` section, so I've now moved it to
the `Misc` section.

That's the only change, Ok for trunk?

Matthew






Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by aligning each tagged variable to the tag granule size,
and by also aligning the end of each object so that the start of any
other data stored on the stack is in a different granule.

This patch ensures the above by forcing the stack pointer to be aligned
before and after allocating any stack objects. Since we are forcing
alignment we also use `align_local_variable` to ensure this new alignment
is advertised properly through SET_DECL_ALIGN.
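
As a rough illustration of the alignment requirement (not the cfgexpand
code), the sketch below lays out objects in a toy address space.  The
16-byte granule, the upward-growing cursor and the names place_object and
struct slot are all assumptions made for the example; a real stack grows
downwards and the granule size comes from the backend.

```c
#include <stdint.h>

#define GRANULE 16u	/* Assumed tag granule size in bytes.  */

/* Half-open address range [start, end) reserved for one object.  */
struct slot
{
  uint64_t start;
  uint64_t end;
};

/* Reserve room for an object of SIZE bytes at or after *CURSOR so that
   both its start and the first byte after it are granule aligned.  */
struct slot
place_object (uint64_t *cursor, uint64_t size)
{
  const uint64_t mask = ~(uint64_t) (GRANULE - 1);
  struct slot s;

  s.start = (*cursor + GRANULE - 1) & mask;
  s.end = (s.start + size + GRANULE - 1) & mask;
  *cursor = s.end;
  return s;
}
```

Because every slot ends on a granule boundary, the next object can never
share a granule with the previous one, even when the object itself is
only a few bytes long.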

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

Backend hooks define the size of a tag, the layout of the HWASAN shadow
memory, and handle emitting the code that inserts and extracts tags from a
pointer.
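
A small sketch of what "inserting and extracting tags" amounts to,
assuming an 8-bit tag held in the top byte of a 64-bit address (as with
AArch64 TBI).  The helper names and the TAG_SHIFT value are illustrative
only; in the patch these details are supplied by backend hooks, and the
real tag increment may treat some tag values specially.

```c
#include <stdint.h>

#define TAG_SHIFT 56		/* Assumed: tag lives in bits 56..63.  */
#define TAG_MASK 0xffull

/* Return ADDR with its top byte replaced by TAG.  */
uint64_t
set_tag (uint64_t addr, uint8_t tag)
{
  return (addr & ~(TAG_MASK << TAG_SHIFT)) | ((uint64_t) tag << TAG_SHIFT);
}

/* Return the tag stored in the top byte of ADDR.  */
uint8_t
get_tag (uint64_t addr)
{
  return (uint8_t) (addr >> TAG_SHIFT);
}

/* Return ADDR with the tag byte cleared.  */
uint64_t
strip_tag (uint64_t addr)
{
  return addr & ~(TAG_MASK << TAG_SHIFT);
}

/* Tag of the variable at tag offset N from a frame whose base tag is
   FRAME_TAG.  The addition wraps modulo 256 in this sketch.  */
uint8_t
variable_tag (uint8_t frame_tag, unsigned n)
{
  return (uint8_t) (frame_tag + n);
}
```

With the stack background (frame tag 0) the first variables get tags 1,
2, 3 and so on, which is the deterministic behaviour described above for
testing and bootstrap.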

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable. This stack region is tagged to match the tag added to
each pointer to that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tags.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.
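
To make the false-positive scenario concrete, here is a toy model of the
shadow memory in plain C.  It is a sketch under stated assumptions: one
shadow byte per 16-byte granule, a 256-byte address space, and the
made-up names tag_region and access_ok.  It does not reflect the real
libhwasan layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16u	/* Assumed tag granule size in bytes.  */
#define MEM_SIZE 256u	/* Size of the toy address space.  */

/* One shadow byte per granule; zero means "untagged".  */
static uint8_t shadow[MEM_SIZE / GRANULE];

/* Set the shadow tag of every granule in [START, START + SIZE).  */
void
tag_region (uint32_t start, uint32_t size, uint8_t tag)
{
  assert (start % GRANULE == 0 && size % GRANULE == 0);
  assert (start + size <= MEM_SIZE);
  memset (&shadow[start / GRANULE], tag, size / GRANULE);
}

/* Nonzero if an access to ADDR through a pointer carrying PTR_TAG
   matches the shadow tag of the granule containing ADDR.  */
int
access_ok (uint32_t addr, uint8_t ptr_tag)
{
  return addr < MEM_SIZE && shadow[addr / GRANULE] == ptr_tag;
}
```

If a function tags its frame with tag 5 and returns without clearing it,
a later untagged access (pointer tag 0) to the same addresses fails the
check.  Calling tag_region with tag 0 on exit restores the expected
behaviour.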

config/ChangeLog:

* bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
during bootstrap.

ChangeLog:

* gcc/asan.c (struct hwasan_stack_var): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_allocas_p): New.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_frame_tag): New.
(hwasan_frame_base): New.
(stack_vars_base_reg_p): New.
(hwasan_maybe_emit_frame_base_init): New.
(hwasan_record_stack_var): New.
(hwasan_get_frame_extent): New.
(hwasan_increment_frame_tag): New.
(hwasan_record_frame_init): New.
(hwasan_emit_prologue): New.
(hwasan_emit_untag_frame): New.
(hwasan_finish_file): New.
(hwasan_truncate_to_tag_size): New.
* gcc/asan.h (hwasan_record_frame_init): New declaration.
(hwasan_record_stack_var): New declaration.
(hwasan_emit_prologue): New declaration.
(hwasan_emit_untag_frame): New declaration.
(hwasan_get_frame_extent): New declaration.
(hwasan_maybe_emit_frame_base_init): New declaration.
(hwasan_frame_base): New declaration.
(stack_vars_base_reg_p): New declaration.
(hwasan_current_frame_tag): New declaration.
(hwasan_increment_frame_tag): New declaration.
(hwasan_truncate_to_tag_size): New declaration.
(hwasan_finish_file): New declaration.
(hwasan_sanitize_p): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_sanitize_allocas_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE): New macro.
(HWASAN_STACK_BACKGROUND): New macro.
* gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_

Re: [PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing

2020-11-20 Thread Matthew Malcomson via Gcc-patches

Hi there,

I was just doing some double-checks and noticed I'd placed the
documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
documented in the `Register Classes` section, so I've now moved it to
the `Misc` section.

That's the only change, Ok for trunk?

Matthew



These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling code that targets
the kernel.  This flag has defaults to match the LLVM implementation and
sets some other behaviors to work in the kernel (e.g. accounting for
the fact that the stack pointer will have 0xff in the top byte and not
calling the userspace library initialisation routines).
The defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.

Since we are introducing a few more conflicts between sanitizer flags we
refactor the checking for such conflicts to use a helper function which
makes checking for such conflicts easier and more consistent.
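
The idea of a table-driven conflict check can be sketched as follows.
This is not the opts.c code: the SAN_* bit values, the pair table and the
function names are hypothetical and only cover the hwasan-related pairs
mentioned in this message.

```c
#include <stddef.h>

/* Hypothetical flag bits for this sketch; these are not the values
   of GCC's enum sanitize_code.  */
enum
{
  SAN_ADDRESS = 1 << 0,
  SAN_THREAD = 1 << 1,
  SAN_HWADDRESS = 1 << 2,
  SAN_KERNEL_HWADDRESS = 1 << 3
};

/* Nonzero if FLAGS enables something from LEFT and something from RIGHT.  */
int
flags_conflict (unsigned flags, unsigned left, unsigned right)
{
  return (flags & left) != 0 && (flags & right) != 0;
}

/* Nonzero if FLAGS contains any pair listed in the conflict table.  */
int
has_conflict (unsigned flags)
{
  static const unsigned pairs[][2] = {
    { SAN_HWADDRESS, SAN_ADDRESS },
    { SAN_HWADDRESS, SAN_THREAD },
    { SAN_HWADDRESS, SAN_KERNEL_HWADDRESS },
    { SAN_KERNEL_HWADDRESS, SAN_ADDRESS },
    { SAN_KERNEL_HWADDRESS, SAN_THREAD }
  };

  for (size_t i = 0; i < sizeof pairs / sizeof pairs[0]; ++i)
    if (flags_conflict (flags, pairs[i][0], pairs[i][1]))
      return 1;
  return 0;
}
```

Keeping the pairs in one table means a new sanitizer only needs new rows
rather than another hand-written condition.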

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/invoke.texi: Document hwasan command line flags.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(find_sanitizer_argument): New.
(report_conflicting_sanitizer_options): New.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan-instrument-stack): New.
(hwasan-random-frame-tag): New.
(hwasan-instrument-allocas): New.
(hwasan-instrument-reads): New.
(hwasan-instrument-writes): New.
(hwasan-instrument-mem-intrinsics): New.
* target.def (HOOK_PREFIX): Add new hook.
(can_tag_addresses): Add new hook under memtag prefix.
* targhooks.c (default_memtag_can_tag_addresses): New.
* targhooks.h (default_memtag_can_tag_addresses): New decl.
* toplev.c (process_options): Ensure hwasan only on
architectures that advertise the possibility.



### Attachment also inlined for ease of reply###


diff --git a/gcc/common.opt b/gcc/common.opt
index fe39b3dee9f270dd39b3f69ff6a0e2e854058703..4e4ba790ce668e490e35c2d95a0b12472754fba4 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -218,7 +218,7 @@ unsigned int flag_sanitize
 
 ; What sanitizers should recover from errors
 Variable
-unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
+unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
 
 ; What the coverage sanitizers should instrument
 Variable
@@ -3445,6 +3445,9 @@ Driver
 static-libasan
 Driver
 
+static-libhwasan
+Driver
+
 static-libtsan
 Driver
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 6de51b521bacb0530799c7cbddb5f6b170bf441c..4f90a49f1b79db406319ddbf89e42f58a695f430 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23294,6 +23294,15 @@ aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1,
   return NULL;
 }
 
+/* Implement TARGET_MEMTAG_CAN_TAG_ADDRESSES.  Here we tell the rest of the
+   compiler that we automatically ignore the top byte of our pointers, which
+   allows using -fsanitize=hwaddress.  */
+bool
+aarch64_can_tag_addresses ()
+{
+  return !TARGET_ILP32;
+}
+
 /* Implement TARGET_ASM_FILE_END for AArch64.  This adds the AArch64 GNU NOTE
section at the end if needed.  */
 #define GNU_PROPERTY_AA

[PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Martin Sebor via Gcc-patches


To detect a subset of VLA misuses, the C front end associates the bounds
of VLAs in function argument lists with the corresponding variables
by implicitly adding an instance of attribute access to each function
declared to take VLAs with the bound expressions chained on the list
of attribute arguments.

Some of these expressions end up modified by the middle end, which
results in references to nonlocal variables (and perhaps other nodes)
used in these expressions getting garbage collected.  A simple example
of this is described in pr97172.

By unsharing the bound expressions the patch below prevents this from
happening (it's not a fix for pr97172).
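
The difference between a shared and an unshared expression can be shown
with a toy tree in plain C.  This is only an analogy: struct node,
make_node and deep_copy are invented for the example, and the real
unshare_expr is more selective about which nodes it copies.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an expression tree node.  */
struct node
{
  int value;
  struct node *left;
  struct node *right;
};

/* Allocate a node; the sketch simply aborts on allocation failure.  */
struct node *
make_node (int value, struct node *left, struct node *right)
{
  struct node *n = malloc (sizeof *n);
  if (n == NULL)
    abort ();
  n->value = value;
  n->left = left;
  n->right = right;
  return n;
}

/* Return a copy of N that shares no nodes with the original.  */
struct node *
deep_copy (const struct node *n)
{
  if (n == NULL)
    return NULL;
  return make_node (n->value, deep_copy (n->left), deep_copy (n->right));
}
```

If two holders point at the same tree, a later in-place rewrite by one of
them is visible to the other.  A holder that took a deep copy first keeps
the original operands regardless of what happens to the shared tree.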

My understanding of the details of node sharing and garbage collection
in GCC is very limited (I didn't expect a tree to be garbage-collected
if it's still referenced by something).  Is this the right approach
to solving this problem?

Thanks
Martin

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index d348e39c27a..4aea4dcafb9 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -58,7 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-family/name-hint.h"
 #include "c-family/known-headers.h"
 #include "c-family/c-spellcheck.h"
-
+#include "gimplify.h"
 #include "tree-pretty-print.h"

 /* In grokdeclarator, distinguish syntactic contexts of declarators.  */
@@ -5780,6 +5780,7 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)

  /* Each variable VLA bound is represented by the dollar
 sign.  */
  spec += "$";
+ nelts = unshare_expr (nelts);
  tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
}
}
@@ -5834,6 +5835,7 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)


   /* Each variable VLA bound is represented by a dollar sign.  */
   spec += "$";
+  nelts = unshare_expr (nelts);
   vbchain = tree_cons (NULL_TREE, nelts, vbchain);
 }

Re: [PATCH,rs6000] Make MMA builtins use opaque modes [v2]

2020-11-20 Thread Segher Boessenkool

Hi!

As Peter remarked, the input_operand's in the disassemble patterns are
better as something more specific (input_operand has everything a "mov"
pattern can handle, but pretty much nothing else can on a load/store
architecture like Power -- it will likely still work, but only after
reloads, so not optimal).

On Thu, Nov 19, 2020 at 12:58:47PM -0600, acsaw...@linux.ibm.com wrote:
>   Thanks for the reviews, here's the updated patch after fixing those things.
> We now have an UNSPEC for xxsetaccz, and an accompanying change to
> rs6000_rtx_costs to make it be cost 0 so that CSE doesn't try to replace it
> with a bunch of register moves.

> +;; The MMA patterns use the multi-register XOmode and OOmode opaque
> +;; modes to implement the target specific __vector_quad and
> +;; __vector_pair types that the MMA built-in functions reference.  We
> +;; use OPAQUE_MODE to prevent anything from trying to open them up.

Great comment :-)

> +(define_expand "mma_disassemble_pair"
> +  [(match_operand:V16QI 0 "mma_disassemble_output_operand")
> +   (match_operand:OO 1 "input_operand")
> +   (match_operand 2 "const_0_to_1_operand")]
> +  "TARGET_MMA"
> +{
> +  rtx src;
> +  int regoff = INTVAL (operands[2]);
> +  src = gen_rtx_UNSPEC (V16QImode,
> +gen_rtvec (2, operands[1], GEN_INT (regoff)),
> +UNSPEC_MMA_EXTRACT);
> +  emit_move_insn (operands[0], src);
> +  DONE;
> +})

Please use tabs for every leading 8 spaces.

> +(define_insn_and_split "*mma_disassemble_pair"
> +  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> +   (unspec:V16QI [(match_operand:OO 1 "input_operand" "wa")
> +  (match_operand 2 "const_0_to_1_operand")]
> +   UNSPEC_MMA_EXTRACT))]

You do it for the last line here, but not the preceding two.  There are
no lines in mma.md that get this wrong right now :-)

> @@ -14049,21 +14055,21 @@ mma_init_builtins (void)
>   }
>else
>   {
> -   if ((attr & RS6000_BTC_QUAD) == 0)
> +   if ( !(d->code == MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL
> +  || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL)
> +&& (attr & RS6000_BTC_QUAD) == 0)
>   attr_args--;

(surplus space right before the !)

Okay for trunk with those nits fixed.  Thanks!

(The input_operand thing can be a follow-up patch; it *does* generate
correct code like this, just not always optimal).


Segher

Re: [PATCH 7/X] libsanitizer: Add tests

2020-11-20 Thread Richard Sandiford via Gcc-patches

Matthew Malcomson  writes:
> Adding hwasan tests.
>
> Only interesting thing here is that we have to make sure the tagging mechanism
> is deterministic to avoid flaky tests.

Sorry for not reviewing this one earlier.  TBH I only spot-checked
the tests themselves (they look good).  But on hwasan-dg.exp: I think
we should try to avoid so much cut-&-paste between asan-dg.exp and
hwasan-dg.exp.

For one thing (and obviously not your fault), it seems odd to me that
check_effective_target_fsanitize_address is defined in asan-dg.exp.
I think it and the new check_effective_targets* should be defined
in target-supports.exp instead.  On:

> +proc check_effective_target_hwaddress_exec {} {
> +if ![check_runtime hwaddress_exec {
> + int main (void) { return 0; }
> +}] {
> + return 0;
> +}
> +return 1;
> +
> +# hwasan doesn't work if there's a ulimit on virtual memory.
> +if ![is_remote target] {
> + if [catch {exec sh -c "ulimit -v"} ulimit_v] {
> + # failed to get ulimit
> + } elseif [regexp {^[0-9]+$} $ulimit_v] {
> + # ulimit -v gave a numeric limit
> + warning "skipping hwasan tests due to ulimit -v"
> + return 0;
> + }
> +}
> +}

either the “hwasan doesn't work” block or the early “return 1” should
be removed.  (I'm guessing the former.)

> +proc hwasan_include_flags {} {
> +global srcdir
> +global TESTING_IN_BUILD_TREE
> +
> +set flags ""
> +
> +if { [is_remote host] || ! [info exists TESTING_IN_BUILD_TREE] } {
> +  return "${flags}"
> +}
> +
> +set flags "-I$srcdir/../../libsanitizer/include"
> +
> +return "$flags"
> +}

This is identical to the asan version, but I guess it's small enough
that the cut-&-paste doesn't matter.

> +
> +#
> +# hwasan_link_flags -- compute library path and flags to find libhwasan.
> +# (originally from g++.exp)
> +#
> +
> +proc hwasan_link_flags { paths } {
> +global srcdir
> +global ld_library_path
> +global shlib_ext
> +global hwasan_saved_library_path
> +
> +set gccpath ${paths}
> +set flags ""
> +
> +set shlib_ext [get_shlib_extension]
> +set hwasan_saved_library_path $ld_library_path
> +
> +if { $gccpath != "" } {
> +  if { [file exists "${gccpath}/libsanitizer/hwasan/.libs/libhwasan.a"]
> +|| [file exists "${gccpath}/libsanitizer/hwasan/.libs/libhwasan.${shlib_ext}"] } {
> +   append flags " -B${gccpath}/libsanitizer/ "
> +   append flags " -B${gccpath}/libsanitizer/hwasan/ "
> +   append flags " -L${gccpath}/libsanitizer/hwasan/.libs "
> +   append ld_library_path ":${gccpath}/libsanitizer/hwasan/.libs"
> +  }
> +} else {
> +  global tool_root_dir
> +
> +  set libhwasan [lookfor_file ${tool_root_dir} libhwasan]
> +  if { $libhwasan != "" } {
> +   append flags "-L${libhwasan} "
> +   append ld_library_path ":${libhwasan}"
> +  }
> +}
> +
> +set_ld_library_path_env_vars
> +
> +return "$flags"
> +}

Here I'd suggest:

- In asan-dg.exp, have:

# Compute library path and flags to find libsanitizer library LIB.
# (originally from g++.exp).
proc asan_link_flags_1 { paths lib } {
…body…
}

…existing comment…
proc asan_link_flags { paths } {
return [asan_link_flags_1 $paths asan]
}

  where …body… is more or less the current body of asan_link_flags with
  “asan” replaced by ${lib}.  E.g.:

global ${lib}_saved_library_path
…
set ${lib}_saved_library_path $ld_library_path

  is fine.  For local variables like:

  set libasan [lookfor_file ${tool_root_dir} libasan]
  if { $libasan != "" } {
  append flags "-L${libasan} "
  append ld_library_path ":${libasan}"
  }

  it would make more sense to use a generic name instead, e.g.:

  set libdir [lookfor_file ${tool_root_dir} lib${lib}]
  if { $libdir != "" } {
  append flags "-L${libdir} "
  append ld_library_path ":${libdir}"
  }

- Have hwasan-dg.exp include asan-dg.exp.

- Have:

proc hwasan_link_flags { paths } {
return [asan_link_flags_1 $paths hwasan]
}

> +
> +#
> +# hwasan_init -- called at the start of each subdir of tests
> +#
> +
> +proc hwasan_init { args } {
> +global TEST_ALWAYS_FLAGS
> +global ALWAYS_CXXFLAGS
> +global TOOL_OPTIONS
> +global hwasan_saved_TEST_ALWAYS_FLAGS
> +global hwasan_saved_ALWAYS_CXXFLAGS
> +
> +setenv HWASAN_OPTIONS "random_tags=0"
> +
> +set link_flags ""
> +if ![is_remote host] {
> + if [info exists TOOL_OPTIONS] {
> + set link_flags "[hwasan_link_flags [get_multilibs ${TOOL_OPTIONS}]]"
> + } else {
> + set link_flags "[hwasan_link_flags [get_multilibs]]"
> + }
> +}
> +
> +set include_flags "[hwasan_include_flags]"
> +
> +if [info exists TEST_ALWAYS_FLAGS] {
> + set hwasan_saved_TEST_ALWAYS_FLAGS $TEST_ALWAYS_FLAGS
> +}
> +if [info exists ALWAYS_CXXFLAGS] {
> + set hwasa

Re: [PATCH v2] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p () before IPA.

2020-11-20 Thread Jeff Law via Gcc-patches

On 6/30/20 12:46 PM, Ilya Leoshkevich wrote:
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547236.html
>
> This is the implementation of Jakub's suggestion: allow
> __builtin_constant_p () after IPA, but fold it into 0.  Smoke test
> passed on s390x-redhat-linux, full regtest and bootstrap are running on
> x86_64-redhat-linux.
>
> ---
>
> Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c) build
> with GCC 10 fails on s390 with "impossible constraint".
>
> The problem is that jump threading makes __builtin_constant_p () lie
> when it splits a path containing a non-constant expression in a way
> that on each of the resulting paths this expression is constant.
>
> Fix by disallowing __builtin_constant_p () on threading paths before
> IPA and fold it into 0 after IPA.
>
> gcc/ChangeLog:
>
> 2020-06-30  Ilya Leoshkevich  
>
>   * tree-ssa-threadbackward.c (thread_jumps::m_allow_bcp_p): New
>   member.
>   (thread_jumps::profitable_jump_thread_path): Do not allow
>   __builtin_constant_p () on threading paths unless m_allow_bcp_p
>   is set.
>   (thread_jumps::find_jump_threads_backwards): Set m_allow_bcp_p.
>   (pass_thread_jumps::execute): Allow __builtin_constant_p () on
>   threading paths after IPA.
>   (pass_early_thread_jumps::execute): Do not allow
>   __builtin_constant_p () on threading paths before IPA.
>   * tree-ssa-threadupdate.c (duplicate_thread_path): Fold
>   __builtin_constant_p () on threading paths into 0.
>
> gcc/testsuite/ChangeLog:
>
> 2020-06-30  Ilya Leoshkevich  
>
>   * gcc.target/s390/builtin-constant-p-threading.c: New test.
So I'm finally getting back to this.  Thanks for your patience.

It's a nasty little problem, and I suspect there are actually some deeper
issues here.  While I'd like to claim it's a bad use of b_c_p, I don't
think I can reasonably make that argument.

So what we have is a b_c_p at the start of an if-else chain.  Subsequent
tests on the "true" arm of the b_c_p test may throw us off the
constant path (because the constants are out of range).  Once all the
tests are passed (it's constant and the constant is in range) the true
arm's terminal block has a special asm that requires a constant
argument.   In the case where we get to the terminal block on the true
arm, the argument to the b_c_p is used as the constant argument to the
special asm.

At first glance jump threading seems to be doing the right thing.  Except
that we end up with two paths to that terminal block with the special
asm, one for each of the two constant arguments to the b_c_p call. 
Naturally since that same value is used in the asm, we have to introduce
a PHI to select between them at the head of the terminal block.   Now
the argument in the asm is no longer constant and boom we fail.
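
To make the shape of the problem concrete, here is a minimal sketch of the pattern, not the kernel code itself.  The function names are made up, and an ordinary function stands in for the asm that requires a constant operand, so the sketch builds anywhere GCC's __builtin_constant_p is available.

```cpp
#include <cassert>

// Hypothetical stand-in for the special asm: it only accepts a small
// constant operand (0..7 here, an assumed range).
static inline int emit_with_immediate (int imm)
{
  assert (imm >= 0 && imm < 8);
  return imm << 2;
}

// Generic fallback that works for any value.
static inline int emit_generic (int x)
{
  return x * 4;
}

// b_c_p at the head of the if-else chain; the range tests may still
// throw us off the constant path.
static inline int emit (int x)
{
  if (__builtin_constant_p (x) && x >= 0 && x < 8)
    return emit_with_immediate (x);
  return emit_generic (x);
}
```

Both arms compute the same value here, so the result does not depend on how b_c_p folds.  With a real asm operand, a PHI merging two constants at the head of the terminal block would leave the operand non-constant, which is the failure described above.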

I briefly pondered if we should only throttle when the argument to the
b_c_p is not used elsewhere.  But I think that just hides the problem
and with a little work I could probably extend the testcase to still
fail in that scenario.

I also briefly pondered if we should isolate the terminal block as well
(essentially creating one for each unique PHI argument).  We'd likely
only need to do that when there's an ASM in the terminal block, but that
likely just papers over the problem as well since the ASM could be in a
successor of the terminal block.

I haven't thought real deeply about it, but I wouldn't be surprised if
there are other passes that can trigger similar problems.  Aggressive
cross-jumping would be the most obvious, but some of the hoisting/sinking
of operations past PHIs would seem potentially problematical as well.

Jakub's suggestion might be the best one in this space.   I don't have
anything better right now.  The deeper questions about other passes
setting up similar scenarios can probably be punted; I'd expect
threading to be far and away the most common way for this to happen and
I'd be comfortable faulting in investigation of other cases if/when they
happen.

So I retract my initial objections.  Let's go with the V2 patch.

jeff

[r11-5191 Regression] FAIL: gcc.target/i386/pr97873-1.c scan-assembler pabsq on Linux/x86_64

2020-11-20 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

fdace7584056de2f63bde2e3087f26beb6b0f97d is the first bad commit
commit fdace7584056de2f63bde2e3087f26beb6b0f97d
Author: Uros Bizjak 
Date:   Fri Nov 20 10:26:34 2020 +0100

i386: Optimize abs expansion [PR97873]

caused

FAIL: gcc.target/i386/pr97873-1.c scan-assembler pabsq

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5191/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr97873-1.c --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email.  For questions about this report, contact me 
at skpgkp2 at gmail dot com)

[PATCH] c++: Add missing verify_type_context call [PR97904]

2020-11-20 Thread Richard Sandiford via Gcc-patches

When adding the verify_type_context target hook, I'd missed
a site that needs to check an array element type.
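
As a rough illustration of why the check has to honour the complain flags rather than always emitting an error: forming an invalid array type during template substitution is a deduction failure, so overload resolution can quietly move on to another candidate.  The sketch below uses only standard C++.  Arrays of void or of references stand in for arrays of SVE types (which need an AArch64 SVE target), and wrap and pick are made-up names.

```cpp
#include <cassert>

// Hypothetical helper: any valid type argument yields ::type == int.
template <typename T> struct wrap { typedef int type; };

// Viable only when "T[2]" is a valid array type.
template <typename T>
int pick (typename wrap<T[2]>::type) { return 1; }

// Fallback, chosen when substitution into the overload above fails.
template <typename T>
int pick (...) { return 2; }
```

Here pick<int> (0) selects the first overload, while pick<void> (0) and pick<int &> (0) fall back to the second without any diagnostic.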

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for master
and GCC 10 branch?

Thanks,
Richard


gcc/cp/
PR c++/97904
* pt.c (tsubst): Use verify_type_context to check the type
of an array element.

gcc/testsuite/
PR c++/97904
* g++.dg/ext/sve-sizeless-1.C: Add more template tests.
* g++.dg/ext/sve-sizeless-2.C: Likewise.
---
 gcc/cp/pt.c   |  4 +++
 gcc/testsuite/g++.dg/ext/sve-sizeless-1.C | 33 +--
 gcc/testsuite/g++.dg/ext/sve-sizeless-2.C | 33 +--
 3 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 463b1c3a57d..89fec98ad67 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15867,6 +15867,10 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
return error_mark_node;
  }
 
+   if (!verify_type_context (input_location, TCTX_ARRAY_ELEMENT, type,
+ !(complain & tf_error)))
+ return error_mark_node;
+
r = build_cplus_array_type (type, domain);
 
if (!valid_array_size_p (input_location, r, in_decl,
diff --git a/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C 
b/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
index 7f829220c71..9f05ca5a855 100644
--- a/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
+++ b/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
@@ -72,10 +72,37 @@ template class templated_struct4;
 template struct templated_struct5 : T {}; // { dg-error {base type 
'[^']*' fails to be a struct or class type} }
 template class templated_struct5;
 
+template struct templated_struct6 { T x[N]; }; // { 
dg-error {array elements cannot have SVE type '(__SVInt8_t|svint8_t)'} }
+template class templated_struct6;
+
+template
+struct templated_struct7 {
+  static const int size = sizeof (T); // { dg-error {SVE type 
'(__SVInt8_t|svint8_t)' does not have a fixed size} }
+#if __cplusplus >= 201103L
+  static const int align = alignof (T); // { dg-error {SVE type 
'(__SVInt8_t|svint8_t)' does not have a defined alignment} "" { target c++11 } }
+#endif
+
+  void f1 (T (&)[2]); // { dg-error {array elements cannot have SVE type 
'(__SVInt8_t|svint8_t)'} }
+#if __cplusplus >= 201103L
+  auto f2 () -> decltype (new T); // { dg-error {cannot allocate objects with 
SVE type '(__SVInt8_t|svint8_t)'} "" { target c++11 } }
+  auto f3 (T *a) -> decltype (delete a); // { dg-error {cannot delete objects 
with SVE type '(__SVInt8_t|svint8_t)'} "" { target c++11 } }
+#else
+  void f2 () throw (T); // { dg-error {cannot throw or catch SVE type 
'(__SVInt8_t|svint8_t)'} "" { target c++98_only } }
+#endif
+};
+template class templated_struct7;
+
+template struct templated_struct8 { typedef int type; };
+
+template
+void sfinae_f1 (typename templated_struct8::type);
+template
+void sfinae_f1 (T &);
+
 #if __cplusplus >= 201103L
 template using typedef_sizeless1 = svint8_t;
 template using typedef_sizeless1 = svint8_t;
-template using array = T[2];
+template using array = T[2]; // { dg-error {array elements cannot 
have SVE type '(svint8_t|__SVInt8_t)'} "" { target c++11 } }
 #endif
 
 // Pointers to sizeless types.
@@ -119,7 +146,7 @@ statements (int n)
   __alignof (ext_produce_sve_sc ()); // { dg-error {SVE type 'svint8_t' does 
not have a defined alignment} }
 
 #if __cplusplus >= 201103L
-  array foo = {}; // { dg-error {array elements cannot have SVE type 
'(svint8_t|__SVInt8_t)'} "" { target c++11 } }
+  array foo = {}; // { dg-message {required from here} "" { target 
c++11 } }
 #endif
 
   // Initialization.
@@ -298,6 +325,8 @@ statements (int n)
   thrower2 ();
 #endif
 
+  sfinae_f1 (sve_sc1);
+
   // Use in traits.  Doesn't use static_assert so that tests work with
   // earlier -std=s.
 
diff --git a/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C 
b/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
index 40b65d37f8a..0b86d9e8217 100644
--- a/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
+++ b/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
@@ -72,10 +72,37 @@ template class templated_struct4;
 template struct templated_struct5 : T {}; // { dg-error {base type 
'[^']*' fails to be a struct or class type} }
 template class templated_struct5;
 
+template struct templated_struct6 { T x[N]; }; // { 
dg-error {array elements cannot have SVE type '(__SVInt8_t|svint8_t)'} }
+template class templated_struct6;
+
+template
+struct templated_struct7 {
+  static const int size = sizeof (T); // { dg-error {SVE type 
'(__SVInt8_t|svint8_t)' does not have a fixed size} }
+#if __cplusplus >= 201103L
+  static const int align = alignof (T); // { dg-error {SVE type 
'(__SVInt8_t|svint8_t)' does not have a defined alignment} "" { target c++11 } }
+#endif
+
+  void f1 (T (&)[2]); // { dg-error {array elements cannot have SVE type 
'(__SVInt8_t|svint8_t)'} }
+#if __cplusplus >= 201103L
+  auto f2 () -> decltype (new T); //

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Marek Polacek via Gcc-patches

On Fri, Nov 20, 2020 at 12:00:58PM -0700, Martin Sebor via Gcc-patches wrote:
> To detect a subset of VLA misuses, the C front associates the bounds
> of VLAs in function argument lists with the corresponding variables
> by implicitly adding an instance of attribute access to each function
> declared to take VLAs with the bound expressions chained on the list
> of attribute arguments.
> 
> Some of these expressions end up modified by the middle end, which
> results in references to nonlocal variables (and perhaps other nodes)
> used in these expression getting garbage collected.  A simple example
> of this is described in pr97172.
> 
> By unsharing the bound expressions the patch below prevents this from
> happening (it's not a fix for pr97172).
> 
> My understanding of the details of node sharing and garbage collection
> in GCC is very limited (I didn't expect a tree to be garbage-collected
> if it's still referenced by something).  Is this the right approach
> to solving this problem?

ISTM that a more natural thing would be to use build_distinct_type_copy
to copy the type you're about to modify.

> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> index d348e39c27a..4aea4dcafb9 100644
> --- a/gcc/c/c-decl.c
> +++ b/gcc/c/c-decl.c
> @@ -58,7 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "c-family/name-hint.h"
>  #include "c-family/known-headers.h"
>  #include "c-family/c-spellcheck.h"
> -
> +#include "gimplify.h"
>  #include "tree-pretty-print.h"
> 
>  /* In grokdeclarator, distinguish syntactic contexts of declarators.  */
> @@ -5780,6 +5780,7 @@ get_parm_array_spec (const struct c_parm *parm, tree
> attrs)
>   /* Each variable VLA bound is represented by the dollar
>  sign.  */
>   spec += "$";
> + nelts = unshare_expr (nelts);
>   tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
> }
> }
> @@ -5834,6 +5835,7 @@ get_parm_array_spec (const struct c_parm *parm, tree
> attrs)
> 
>/* Each variable VLA bound is represented by a dollar sign.  */
>spec += "$";
> +  nelts = unshare_expr (nelts);
>vbchain = tree_cons (NULL_TREE, nelts, vbchain);
>  }
> 

Marek

Re: [PATCH] c++: Add missing verify_type_context call [PR97904]

2020-11-20 Thread Marek Polacek via Gcc-patches

On Fri, Nov 20, 2020 at 07:27:54PM +, Richard Sandiford via Gcc-patches 
wrote:
> When adding the verify_type_context target hook, I'd missed
> a site that needs to check an array element type.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for master
> and GCC 10 branch?

Not an approval, but looks fine to me, and matches the other uses of
verify_type_context I've checked in the C++ FE.

> gcc/cp/
>   PR c++/97904
>   * pt.c (tsubst): Use verify_type_context to check the type
>   of an array element.
> 
> gcc/testsuite/
>   PR c++/97904
>   * g++.dg/ext/sve-sizeless-1.C: Add more template tests.
>   * g++.dg/ext/sve-sizeless-2.C: Likewise.
> ---
>  gcc/cp/pt.c   |  4 +++
>  gcc/testsuite/g++.dg/ext/sve-sizeless-1.C | 33 +--
>  gcc/testsuite/g++.dg/ext/sve-sizeless-2.C | 33 +--
>  3 files changed, 66 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 463b1c3a57d..89fec98ad67 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -15867,6 +15867,10 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
> tree in_decl)
>   return error_mark_node;
> }
>  
> + if (!verify_type_context (input_location, TCTX_ARRAY_ELEMENT, type,
> +   !(complain & tf_error)))
> +   return error_mark_node;
> +
>   r = build_cplus_array_type (type, domain);
>  
>   if (!valid_array_size_p (input_location, r, in_decl,
> diff --git a/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C 
> b/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
> index 7f829220c71..9f05ca5a855 100644
> --- a/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
> +++ b/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
> @@ -72,10 +72,37 @@ template class templated_struct4;
>  template struct templated_struct5 : T {}; // { dg-error {base 
> type '[^']*' fails to be a struct or class type} }
>  template class templated_struct5;
>  
> +template struct templated_struct6 { T x[N]; }; // { 
> dg-error {array elements cannot have SVE type '(__SVInt8_t|svint8_t)'} }
> +template class templated_struct6;
> +
> +template
> +struct templated_struct7 {
> +  static const int size = sizeof (T); // { dg-error {SVE type 
> '(__SVInt8_t|svint8_t)' does not have a fixed size} }
> +#if __cplusplus >= 201103L
> +  static const int align = alignof (T); // { dg-error {SVE type 
> '(__SVInt8_t|svint8_t)' does not have a defined alignment} "" { target c++11 
> } }
> +#endif
> +
> +  void f1 (T (&)[2]); // { dg-error {array elements cannot have SVE type 
> '(__SVInt8_t|svint8_t)'} }
> +#if __cplusplus >= 201103L
> +  auto f2 () -> decltype (new T); // { dg-error {cannot allocate objects 
> with SVE type '(__SVInt8_t|svint8_t)'} "" { target c++11 } }
> +  auto f3 (T *a) -> decltype (delete a); // { dg-error {cannot delete 
> objects with SVE type '(__SVInt8_t|svint8_t)'} "" { target c++11 } }
> +#else
> +  void f2 () throw (T); // { dg-error {cannot throw or catch SVE type 
> '(__SVInt8_t|svint8_t)'} "" { target c++98_only } }
> +#endif
> +};
> +template class templated_struct7;
> +
> +template struct templated_struct8 { typedef int type; };
> +
> +template
> +void sfinae_f1 (typename templated_struct8::type);
> +template
> +void sfinae_f1 (T &);
> +
>  #if __cplusplus >= 201103L
>  template using typedef_sizeless1 = svint8_t;
>  template using typedef_sizeless1 = svint8_t;
> -template using array = T[2];
> +template using array = T[2]; // { dg-error {array elements 
> cannot have SVE type '(svint8_t|__SVInt8_t)'} "" { target c++11 } }
>  #endif
>  
>  // Pointers to sizeless types.
> @@ -119,7 +146,7 @@ statements (int n)
>__alignof (ext_produce_sve_sc ()); // { dg-error {SVE type 'svint8_t' does 
> not have a defined alignment} }
>  
>  #if __cplusplus >= 201103L
> -  array foo = {}; // { dg-error {array elements cannot have SVE 
> type '(svint8_t|__SVInt8_t)'} "" { target c++11 } }
> +  array foo = {}; // { dg-message {required from here} "" { target 
> c++11 } }
>  #endif
>  
>// Initialization.
> @@ -298,6 +325,8 @@ statements (int n)
>thrower2 ();
>  #endif
>  
> +  sfinae_f1 (sve_sc1);
> +
>// Use in traits.  Doesn't use static_assert so that tests work with
>// earlier -std=s.
>  
> diff --git a/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C 
> b/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
> index 40b65d37f8a..0b86d9e8217 100644
> --- a/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
> +++ b/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
> @@ -72,10 +72,37 @@ template class templated_struct4;
>  template struct templated_struct5 : T {}; // { dg-error {base 
> type '[^']*' fails to be a struct or class type} }
>  template class templated_struct5;
>  
> +template struct templated_struct6 { T x[N]; }; // { 
> dg-error {array elements cannot have SVE type '(__SVInt8_t|svint8_t)'} }
> +template class templated_struct6;
> +
> +template
> +struct templated_struct7 {
> +  static const int size = sizeof (T); // { dg-error {S

Improve hashing of decls in ipa-icf-gimple

2020-11-20 Thread Jan Hubicka

Hi,
another remaining case is that we end up comparing calls with mismatching
number of parameters or with different permutations of them.

This is because we hash decls to nothing. This patch improves that by
hashing decls by their code and parm decls by indexes that are stable.
Also for default defs in SSA_NAMEs we can add the corresponding decl (that
is usually parm decls).
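
A minimal sketch of the idea, outside of GCC: hash a parameter by its kind plus its position in the parameter chain, so that corresponding parameters of two functions hash the same while different positions do not.  Param, Kind, mix and hash_param are made-up names, and mix is a toy combiner, not inchash.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a chain of PARM_DECLs.
struct Param { const Param *next; };

enum Kind { KIND_VAR = 1, KIND_PARAM = 2 };

// Position of TARGET in the chain starting at FIRST, capped at 32 as in
// the patch so that the walk stays cheap.
unsigned param_index (const Param *first, const Param *target)
{
  unsigned index = 0;
  for (const Param *p = first; p && index < 32; p = p->next, index++)
    if (p == target)
      break;
  return index;
}

// Toy hash combiner (an assumption, not what inchash does).
std::size_t mix (std::size_t seed, std::size_t v)
{
  return seed * 31 + v;
}

// Hash a parameter by its kind and its stable index, never by its address.
std::size_t hash_param (const Param *first, const Param *target)
{
  return mix (mix (0, KIND_PARAM), param_index (first, target));
}
```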

Still we could improve on this by hashing SSA names by their definition
parameters and possibly making maps of other decls and assigning them stable
function local IDs.

Bootstrapped/regtested x86_64-linux, committed.

* ipa-icf-gimple.c (func_checker::hash_operand): Improve hashing of
decls.
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 250f02391db..7e2b3c4624c 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -229,13 +242,29 @@ func_checker::hash_operand (const_tree arg, inchash::hash 
&hstate,
 
   switch (TREE_CODE (arg))
 {
+case PARM_DECL:
+  {
+   unsigned int index = 0;
+   if (DECL_CONTEXT (arg))
+ for (tree p = DECL_ARGUMENTS (DECL_CONTEXT (arg));
+  p && index < 32; p = DECL_CHAIN (p), index++)
+   if (p == arg)
+ break;
+   hstate.add_int (PARM_DECL);
+   hstate.add_int (index);
+  }
+  return;
 case FUNCTION_DECL:
 case VAR_DECL:
 case LABEL_DECL:
-case PARM_DECL:
 case RESULT_DECL:
 case CONST_DECL:
+  hstate.add_int (TREE_CODE (arg));
+  return;
 case SSA_NAME:
+  hstate.add_int (SSA_NAME);
+  if (SSA_NAME_IS_DEFAULT_DEF (arg))
+   hash_operand (SSA_NAME_VAR (arg), hstate, flags);
   return;
 case FIELD_DECL:
   inchash::add_expr (DECL_FIELD_OFFSET (arg), hstate, flags);
@@ -252,6 +281,8 @@ func_checker::hash_operand (const_tree arg, inchash::hash 
&hstate,
   hstate.add_int (0xc10bbe5);
   return;
 }
+  gcc_assert (!DECL_P (arg));
+  gcc_assert (!TYPE_P (arg));
 
   return operand_compare::hash_operand (arg, hstate, flags);
 }

Re: [PATCH] ipa-cp: Avoid unwanted multiple propagations (PR 97816)

2020-11-20 Thread Martin Jambor

Hi,

this is an updated patch based on our conversation on IRC today.  So far
I have had a look at the effects on only tramp3d and although it makes
the heuristics more pessimistic more times than optimistic (number of
clones at -Ofast drops from 559 to 557), there are also lattices which
are massively boosted.

When looking at the testcase of PR 97816 I realized that the reason
why we were hitting overflows in size growth estimates in IPA-CP is
not because the chains of how lattices feed values to each other are
so long but mainly because we add estimates in callee lattices to
caller lattices for each value source, which roughly corresponds to a
call graph edge, and therefore if there are multiple calls between two
functions passing the same value in a parameter we end up doing it
more than once, sometimes actually quite many times.

This patch avoids it by using a hash_set to remember the source values
we have already updated and not increasing their size again.
Furthermore, to improve estimation of times we scale the propagated
time benefits with edge frequencies as we accumulate them.
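
The deduplication can be sketched roughly as follows, using the standard library instead of GCC's hash_set.  Value, Source and propagate_size are made-up names; the point is only that several sources referring to the same value bump its size once.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for ipcp_value and ipcp_value_source.
struct Value { long prop_size = 0; };
struct Source { Value *val; };   // roughly one per call graph edge

// Add SIZE to every distinct source value exactly once, however many
// sources (edges) refer to it.
void propagate_size (const std::vector<Source> &sources, long size)
{
  std::unordered_set<Value *> processed;
  for (const Source &src : sources)
    {
      if (!src.val)
        continue;
      // insert ().second is true only the first time a value is seen.
      if (processed.insert (src.val).second)
        src.val->prop_size += size;
    }
}
```

Note that std::unordered_set::insert reports "newly inserted", whereas the patch tests the negation of hash_set::add, which reports "already present".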

This should make any overflows very unlikely but not impossible, so I
still included checks for overflows but decided to restructure the
code to only need it in the propagate_effects function and modified it
so that it does not need to perform the check before each sum.

This is because I decided to add local estimates to propagated
estimates already in propagate_effects and not at the evaluation time.
The function can then do the sums in a wide type and discard them in
the unlikely case of an overflow.  I also decided to use the
opportunity to make propagated effect stats now include stats from
other values in the same SCCs.  In the dumps I have seen this tended
to increase size cost a tiny bit more than the estimated time benefit
but both increases were small.
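
Doing the sums in a wide type and checking once at the end can be sketched like this.  It is only an illustration under assumed names (sum_sizes is hypothetical), with a 64-bit integer standing in for HOST_WIDE_INT.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Add three int-sized estimates in a 64-bit type.  On overflow of the int
// range the sum is discarded and *RESULT is left untouched.
bool sum_sizes (int local, int propagated, int incoming, int *result)
{
  std::int64_t wide = static_cast<std::int64_t> (local) + propagated + incoming;
  if (wide > INT_MAX || wide < INT_MIN)
    return false;
  *result = static_cast<int> (wide);
  return true;
}
```

Compared with a saturating helper like the removed safe_add, no check is needed before each individual addition.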

Bootstrapped and LTO bootstrapped on x86_64-linux.  OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2020-11-20  Martin Jambor  

PR ipa/97816
* ipa-cp.c (safe_add): Removed.
(good_cloning_opportunity_p): Remove special handling of INT_MAX.
(value_topo_info::propagate_effects): Take care not to
propagate size from one value to another through more sources.  Scale
propagated times with edge frequencies.  Include local time and size
in propagated ones here.  Take care not to overflow size.
(decide_about_value): Do not add local and propagated effects when
passing them to good_cloning_opportunity_p.
---
 gcc/ipa-cp.c | 68 +---
 1 file changed, 32 insertions(+), 36 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index c3ee71e16e1..863b4f7d228 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3264,13 +3264,6 @@ good_cloning_opportunity_p (struct cgraph_node *node, 
sreal time_benefit,
 return false;
 
   gcc_assert (size_cost > 0);
-  if (size_cost == INT_MAX)
-{
-  if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, " good_cloning_opportunity_p returning "
-"false because of size overflow.\n");
-  return false;
-}
 
   class ipa_node_params *info = IPA_NODE_REF (node);
   int eval_threshold = opt_for_fn (node->decl, param_ipa_cp_eval_threshold);
@@ -3840,20 +3833,6 @@ propagate_constants_topo (class ipa_topo_info *topo)
 }
 }
 
-
-/* Return the sum of A and B if none of them is bigger than INT_MAX/2, return
-   INT_MAX.  */
-
-static int
-safe_add (int a, int b)
-{
-  if (a > INT_MAX/2 || b > INT_MAX/2)
-return INT_MAX;
-  else
-return a + b;
-}
-
-
 /* Propagate the estimated effects of individual values along the topological
from the dependent values to those they depend on.  */
 
@@ -3862,30 +3841,51 @@ void
 value_topo_info::propagate_effects ()
 {
   ipcp_value *base;
+  hash_set *> processed_srcvals;
 
   for (base = values_topo; base; base = base->topo_next)
 {
   ipcp_value_source *src;
   ipcp_value *val;
   sreal time = 0;
-  int size = 0;
+  HOST_WIDE_INT size = 0;
 
   for (val = base; val; val = val->scc_next)
{
  time = time + val->local_time_benefit + val->prop_time_benefit;
- size = safe_add (size, safe_add (val->local_size_cost,
-  val->prop_size_cost));
+ size = size + val->local_size_cost + val->prop_size_cost;
}
 
   for (val = base; val; val = val->scc_next)
-   for (src = val->sources; src; src = src->next)
- if (src->val
- && src->cs->maybe_hot_p ())
+   {
+ processed_srcvals.empty ();
+ for (src = val->sources; src; src = src->next)
+   if (src->val
+   && src->cs->maybe_hot_p ())
+ {
+   if (!processed_srcvals.add (src->val))
+ {
+   HOST_WIDE_INT prop_size = size + src->val->prop_size_cost;
+   if (

Re: [PATCH] ipa: special pass-through op for Fortran strides

2020-11-20 Thread Jeff Law via Gcc-patches




On 6/12/20 3:25 PM, Martin Jambor wrote:
> Hi,
>
> when Fortran functions pass array descriptors they receive as a
> parameter to another function, they actually rebuild it.  Thanks to
> work done mainly by Feng, IPA-CP can already handle the cases when
> they pass directly the values loaded from the original descriptor.
> Unfortunately, perhaps the most important one, stride, is first
> checked against zero and is replaced with one in that case:
>
>   _12 = *a_11(D).dim[0].stride;
>   if (_12 != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
>
>   
> // empty BB
>   
>   # iftmp.22_9 = PHI <_12(2), 1(3)>
>...
>parm.6.dim[0].stride = iftmp.22_9;
>...
>__x_MOD_foo (&parm.6, b_31(D));
>
> in the most important and hopefully common cases, the incoming value
> is already 1 and we fail to propagate it.
>
> I would therefore like to propose the following way of encoding this
> situation in pass-through jump functions using the ASSERT_EXPR
> operation code meaning that if the incoming value is the same as the
> "operand" in the jump function, it is passed on, otherwise the result
> is unknown.  This of course captures only the single (but most
> important) case but is an improvement and does not need enlarging the
> jump function structure and is simple to pattern match.  Encoding that
> zero needs to be changed to one would need another field and matching
> it would be slightly more complicated too.
>
> Bootstrapped and tested on x86_64-linux, LTO bootstrap is underway.  OK
> if it passes?
>
> Thanks,
>
> Martin
>
>
> 2020-06-12  Martin Jambor  
>
>   * ipa-prop.h (ipa_pass_through_data): Expand comment describing
>   operation.
>   * ipa-prop.c (analyze_agg_content_value): Detect new special case and
>   encode it as ASSERT_EXPR.
>   * ipa-cp.c (values_equal_for_ipcp_p): Move before
>   ipa_get_jf_arith_result.
>   (ipa_get_jf_arith_result): Special case ASSERT_EXPR.
>
>   testsuite/
>   * gfortran.dg/ipcp-array-2.f90: New test.
I don't see any feedback on this (old) patch.

It's not obvious from the patch, but I get the impression that the
ASSERT_EXPR isn't actually added to the IL, but instead just gets
stuffed into the agg_value structure.  Can you confirm one way or the
other.  If it ends up in the IL then we need to make sure it doesn't
escape the pass.

Assuming we don't ultimately extract the info from the agg_value
structure and add it to the IL, I don't have any significant concerns
about this patch.  Do you want to re-test it and include it in gcc-11?
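
For reference, the semantics proposed in the quoted mail can be modelled in a few lines.  This is a sketch of the described behaviour only, not of ipa-cp.c; Lattice and pass_through_assert are made-up names.

```cpp
#include <cassert>

// Hypothetical model of a propagated value: either a known constant or
// unknown.
struct Lattice { bool known; long value; };

// Pass-through with the special "assert" operation: the incoming value is
// passed on only if it equals OPERAND, otherwise the result is unknown.
Lattice pass_through_assert (Lattice incoming, long operand)
{
  if (incoming.known && incoming.value == operand)
    return incoming;
  return Lattice{false, 0};
}
```

With operand 1, an incoming stride of 1 is propagated, while an incoming 0 (which the Fortran code would replace with 1) is conservatively treated as unknown.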

jeff

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Martin Sebor via Gcc-patches


On 11/20/20 12:29 PM, Marek Polacek wrote:
> On Fri, Nov 20, 2020 at 12:00:58PM -0700, Martin Sebor via Gcc-patches wrote:
> > To detect a subset of VLA misuses, the C front associates the bounds
> > of VLAs in function argument lists with the corresponding variables
> > by implicitly adding an instance of attribute access to each function
> > declared to take VLAs with the bound expressions chained on the list
> > of attribute arguments.
> > 
> > Some of these expressions end up modified by the middle end, which
> > results in references to nonlocal variables (and perhaps other nodes)
> > used in these expression getting garbage collected.  A simple example
> > of this is described in pr97172.
> > 
> > By unsharing the bound expressions the patch below prevents this from
> > happening (it's not a fix for pr97172).
> > 
> > My understanding of the details of node sharing and garbage collection
> > in GCC is very limited (I didn't expect a tree to be garbage-collected
> > if it's still referenced by something).  Is this the right approach
> > to solving this problem?
> 
> ISTM that a more natural thing would be to use build_distinct_type_copy
> to copy the type you're about to modify.


The get_parm_array_spec function doesn't modify a type.  It's called
from push_parm_decl() to build an "arg spec" attribute with the VLA
bounds as arguments.  push_parm_decl() then adds the attribute to
the function's PARM_DECL by calling decl_attributes().  When all of
the function's parameters have been processed the "arg specs" are
then extracted and added as an attribute access specification with
the VLA bounds added to the function declaration.

Martin




> > diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> > index d348e39c27a..4aea4dcafb9 100644
> > --- a/gcc/c/c-decl.c
> > +++ b/gcc/c/c-decl.c
> > @@ -58,7 +58,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "c-family/name-hint.h"
> >  #include "c-family/known-headers.h"
> >  #include "c-family/c-spellcheck.h"
> > -
> > +#include "gimplify.h"
> >  #include "tree-pretty-print.h"
> > 
> >  /* In grokdeclarator, distinguish syntactic contexts of declarators.  */
> > @@ -5780,6 +5780,7 @@ get_parm_array_spec (const struct c_parm *parm, tree
> > attrs)
> >   /* Each variable VLA bound is represented by the dollar
> >  sign.  */
> >   spec += "$";
> > + nelts = unshare_expr (nelts);
> >   tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
> > }
> > }
> > @@ -5834,6 +5835,7 @@ get_parm_array_spec (const struct c_parm *parm, tree
> > attrs)
> > 
> >/* Each variable VLA bound is represented by a dollar sign.  */
> >spec += "$";
> > +  nelts = unshare_expr (nelts);
> >vbchain = tree_cons (NULL_TREE, nelts, vbchain);
> >  }
> > 
> 
> Marek

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 20, 2020 at 01:28:03PM -0700, Martin Sebor via Gcc-patches wrote:
> On 11/20/20 12:29 PM, Marek Polacek wrote:
> > On Fri, Nov 20, 2020 at 12:00:58PM -0700, Martin Sebor via Gcc-patches 
> > wrote:
> > > To detect a subset of VLA misuses, the C front associates the bounds
> > > of VLAs in function argument lists with the corresponding variables
> > > by implicitly adding an instance of attribute access to each function
> > > declared to take VLAs with the bound expressions chained on the list
> > > of attribute arguments.
> > > 
> > > Some of these expressions end up modified by the middle end, which
> > > results in references to nonlocal variables (and perhaps other nodes)
> > > used in these expression getting garbage collected.  A simple example
> > > of this is described in pr97172.
> > > 
> > > By unsharing the bound expressions the patch below prevents this from
> > > happening (it's not a fix for pr97172).
> > > 
> > > My understanding of the details of node sharing and garbage collection
> > > in GCC is very limited (I didn't expect a tree to be garbage-collected
> > > if it's still referenced by something).  Is this the right approach
> > > to solving this problem?
> > 
> > ISTM that a more natural thing would be to use build_distinct_type_copy
> > to copy the type you're about to modify.
> 
> The get_parm_array_spec function doesn't modify a type.  It's called
> from push_parm_decl() to build an "arg spec" attribute with the VLA
> bounds as arguments.  push_parm_decl() then adds the attribute to
> the function's PARM_DECL by calling decl_attributes().  When all of
> the function's parameters have been processed the "arg specs" are
> then extracted and added as an attribute access specification with
> the VLA bounds added to the function declaration.

Guess it isn't that the trees would be GC collected, that can't happen if
they are referenced from reachable trees, but the thing is that the
gimplifier is destructive, it overwrites various trees as it is gimplifying
function bodies.  That is why the function bodies are normally unshared, but
that unsharing doesn't really walk function attributes.
On the other side, for VLAs unsharing is quite harmful, e.g. if there is
  int vla[foo ()];
then if it is unshared (except when it is a SAVE_EXPR that wouldn't be
unshared), then it could call the foo () function multiple times.
For VLA bounds in PARM_DECLs we are hopefully more restricted than that,
if it involves only other PARM_DECLs and constants and expressions composed
of them, the unsharing could be fine.
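
A toy model of the sharing problem, far removed from real trees and the real gimplifier: a destructive lowering step rewrites a node in place, so anything else still pointing at that node sees the rewritten form, while a deep copy made beforehand does not.  Expr, lower_in_place and unshare are all made-up names for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical expression node: 'c' = constant, '+' = sum,
// 't' = temporary produced by lowering.
struct Expr
{
  char kind;
  int value;
  std::shared_ptr<Expr> lhs, rhs;
};
using ExprP = std::shared_ptr<Expr>;

ExprP constant (int v)
{
  return std::make_shared<Expr> (Expr{'c', v, nullptr, nullptr});
}

ExprP sum (ExprP a, ExprP b)
{
  return std::make_shared<Expr> (Expr{'+', 0, a, b});
}

int eval (const Expr &e)
{
  return e.kind == '+' ? eval (*e.lhs) + eval (*e.rhs) : e.value;
}

// Destructive lowering: overwrite the sum node with a temporary.
void lower_in_place (Expr &e)
{
  if (e.kind != '+')
    return;
  e.value = eval (e);
  e.kind = 't';
  e.lhs.reset ();
  e.rhs.reset ();
}

// Deep copy, so later in-place rewrites of E do not affect the result.
ExprP unshare (const ExprP &e)
{
  if (!e)
    return nullptr;
  return std::make_shared<Expr> (
    Expr{e->kind, e->value, unshare (e->lhs), unshare (e->rhs)});
}
```

The model has no side effects, so it does not capture the int vla[foo ()] hazard: copying an expression that contains a call is exactly the case where unsharing could change behaviour.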

Jakub

Go patch committed: Change name mangling convention

2020-11-20 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go frontend and libgo changes the name mangling
convention.  The previous convention (which was actually the second
one) turned out to be ambiguous when the path to a package contained a
dot; this is a common case, as many package paths are of the form
"github.com/name/package".  The previous convention also did not
support package paths that start with a digit, which is less common
but does occur (https://golang.org/issue/41862).

This patch rewrites and somewhat simplifies the naming convention.
Now dot is used only as a separator character and for special names.
Actual name mangling, for representing Unicode characters and other
non-ASCII alphanumerics, is now done with an underscore.  This has the
advantage of being simpler, in that it avoids the overloading that the
previous convention applied to dot.  It has the disadvantage that
mangled symbol names look somewhat like valid Go names, since valid Go
names can of course contain underscore.  Still, it seems like the best
choice.

This patch increments the libgo major version number, since many
symbol names have changed.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian


patch.txt.gz
Description: application/gzip

Re: [PATCH] tighten up attribute access validation (PR 97879)

2020-11-20 Thread Martin Sebor via Gcc-patches


On 11/18/20 4:36 PM, Jeff Law wrote:



On 11/18/20 3:41 PM, Martin Sebor via Gcc-patches wrote:

The access attribute handler doesn't check to make sure the mode
argument is an identifier and readily accepts string arguments
which are assumed to be the condensed internal representation
the user attribute is translated to.  This can cause all sorts
of unintended behavior when the user supplies a bogus string,
either by accident or in an effort to break things.

The attached patch tightens up the attribute handler to reject
strings and any other modes that aren't the expected identifiers.
It distinguishes the internal strings by introducing a new flag,
ATTR_FLAG_INTERNAL, and calling decl_attributes() with it.

Martin

gcc-97879.diff

PR middle-end/97879 - ICE on invalid mode in attribute access

gcc/c-family/ChangeLog:

PR middle-end/97879
* c-attribs.c (handle_access_attribute): Handle ATTR_FLAG_INTERNAL.
Error out on invalid modes.

gcc/ChangeLog:

PR middle-end/97879
* tree-core.h (enum attribute_flags): Add ATTR_FLAG_INTERNAL.

gcc/testsuite/ChangeLog:

PR middle-end/97879
* gcc.dg/attr-access-3.c: New test.

OK
jeff



The patch was missing a corresponding change to the C front end.
After retesting (by accident, the initial patch had not gone through
bootstrap), I committed r11-5209 with the missing bit added.

Martin
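
For readers unfamiliar with the attribute, here is a minimal sketch of the
form the handler is meant to accept: the access mode is written as a bare
identifier (read_only below), not as a string.  The function sum_ints and
the ATTR_ACCESS wrapper macro are made up for illustration, and the guard
assumes a compiler that provides __has_attribute; on compilers without
attribute access the macro simply expands to nothing.

```c
#include <stddef.h>

/* ATTR_ACCESS is a hypothetical wrapper, not part of GCC.  It expands to
   the attribute only when the compiler reports support for it.  */
#if defined(__has_attribute)
# if __has_attribute(access)
#  define ATTR_ACCESS(...) __attribute__ ((access (__VA_ARGS__)))
# endif
#endif
#ifndef ATTR_ACCESS
# define ATTR_ACCESS(...)
#endif

/* read_only is an identifier naming the access mode; 1 is the position
   of the pointer argument and 2 the position of its size argument.  */
ATTR_ACCESS (read_only, 1, 2)
long sum_ints (const int *p, size_t n);

long sum_ints (const int *p, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; ++i)
    total += p[i];
  return total;
}
```

Going by the description above, a string in the mode position, as in
access ("read_only", 1, 2), is the kind of input the tightened handler is
meant to reject.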

[PATCH] Darwin, libgfortran : Do not use environ directly from the library.

2020-11-20 Thread Iain Sandoe


Hi,

not sure if this is covered directly by my Darwin maintainer’s hat so …

-

On macOS / Darwin, the environ variable can be used directly in the
code of an executable, but cannot be used in the code of a shared
library (i.e. libgfortran.dylib)**

In such cases, the function _NSGetEnviron should be called to get
the address of 'environ'.

Tested on a number of Darwin platforms, old and new, and on
x86_64/powerpc64-linux.

OK for master?
… and backports to open branches?
thanks
Iain

** we’ve been “getting away with it” because of the application of
dynamic_lookup in a rather broad manner, which I’m about to get rid
of.

—

libgfortran/ChangeLog:

* intrinsics/execute_command_line.c (environ): Use
_NSGetEnviron to get the environment pointer on Darwin.



---
 libgfortran/intrinsics/execute_command_line.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libgfortran/intrinsics/execute_command_line.c b/libgfortran/intrinsics/execute_command_line.c
index 71d61a766ad..6d7b8fc658e 100644
--- a/libgfortran/intrinsics/execute_command_line.c
+++ b/libgfortran/intrinsics/execute_command_line.c
@@ -34,7 +34,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #endif
 #ifdef HAVE_POSIX_SPAWN
 #include <spawn.h>
+# ifdef __APPLE__
+#  include <crt_externs.h>
+#  define environ (*_NSGetEnviron ())
+# else
 extern char **environ;
+# endif
 #endif
 #if defined(HAVE_POSIX_SPAWN) || defined(HAVE_FORK)
 #include <sys/wait.h>
--
2.24.1
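
Outside libgfortran, the same pattern can be sketched as a small
stand-alone helper.  This is only an illustration of the approach described
above, not code from the patch: the names PROCESS_ENVIRON and lookup_env
are hypothetical, and the non-Darwin branch assumes a POSIX C library that
defines 'environ'.

```c
#include <stddef.h>
#include <string.h>

#ifdef __APPLE__
# include <crt_externs.h>            /* declares _NSGetEnviron () */
# define PROCESS_ENVIRON (*_NSGetEnviron ())
#else
extern char **environ;               /* defined by the C library */
# define PROCESS_ENVIRON environ
#endif

/* Return the value of NAME in the process environment, or NULL if it
   is not set.  Works the same way from an executable or a shared
   library because it never names 'environ' directly on Darwin.  */
const char *lookup_env (const char *name)
{
  size_t len = strlen (name);
  char **env = PROCESS_ENVIRON;

  if (env == NULL)
    return NULL;
  for (; *env != NULL; ++env)
    if (strncmp (*env, name, len) == 0 && (*env)[len] == '=')
      return *env + len + 1;
  return NULL;
}
```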

Re: [Ada] Build support units for 128-bit integer types on 64-bit platforms

2020-11-20 Thread Maciej W. Rozycki

On Wed, 18 Nov 2020, Eric Botcazou wrote:

> > that broke the build of an ada cross compiler targeting
> > powerpc64le-linux-gnu. target_cpu is powerpc64le which is not matched by
> > the Makefile logic.
> > 
> > Ok for the trunk?
> > 
> > PR ada/97859
> > * Makefile.rtl (powerpc% linux%): Also match powerpc64le cpu.
> 
> Yes, thanks.

 For the record: in a native `powerpc64le-linux-gnu' build despite the 
issue a functional compiler used to be built, except for the acats test 
suite reporting a catastrophic failure:

=== acats support ===
Generating support files... Failed to compile impbit
make: [.../gcc/ada/gcc-interface/Make-lang.in:958: check-acats] Error 1 (ignored)

and then acats.log reporting:

.../gcc/gnatbind -x impbit.ali
error: "s-imgllli.ali" not found, "s-imgllli.ads" must be compiled
gnatmake: *** bind failed.
 Failed to compile impbit

I noticed that in VAX verification and came up with the same fix, and was 
about to post it (now that I completed the VAX effort and could catch up 
with other stuff) when I noticed it has been addressed already.

 Native `powerpc64le-linux-gnu' acats test results are now all-clean:

=== acats Summary ===
# of expected passes       2320
# of unexpected failures   0

(well, non-native acats verification sadly doesn't work anyway).

  Maciej

Re: [PATCH 1/2] NetBSD/libgcc: Check for TARGET_DL_ITERATE_PHDR in the unwinder

2020-11-20 Thread Maciej W. Rozycki

On Mon, 16 Nov 2020, Jeff Law wrote:

> > libgcc/
> > * unwind-dw2-fde-dip.c [__OpenBSD__ || __NetBSD__] 
> > (USE_PT_GNU_EH_FRAME): Do not define if !TARGET_DL_ITERATE_PHDR.
> OK

 Committed now, thank you for your review.

  Maciej

[committed v2 2/2] libada: Check for the presence of _SC_NPROCESSORS_ONLN

2020-11-20 Thread Maciej W. Rozycki

Check for the presence of _SC_NPROCESSORS_ONLN rather than using a list 
of OS-specific macros to decide whether to use `sysconf' like elsewhere 
across GCC sources, fixing a compilation error:

adaint.c: In function '__gnat_number_of_cpus':
adaint.c:2398:26: error: '_SC_NPROCESSORS_ONLN' undeclared (first use in this function)
 2398 |   cores = (int) sysconf (_SC_NPROCESSORS_ONLN);
      |                          ^~~~~~~~~~~~~~~~~~~~
adaint.c:2398:26: note: each undeclared identifier is reported only once for each function it appears in

at least with VAX/NetBSD 1.6.2.

gcc/ada/
* adaint.c (__gnat_number_of_cpus): Check for the presence of 
_SC_NPROCESSORS_ONLN rather than a list of OS-specific macros
to decide whether to use `sysconf'.
---
On Sun, 15 Nov 2020, Arnaud Charlet wrote:

> > NB we could probably replace the list of OS #ifdefs with just a check for 
> > _SC_NPROCESSORS_ONLN, making use of it automagically with any new OS that 
> > supports it, as from the length the list has grown to I gather the 
> > `sysconf' API for this variable has become a semi-established standard now 
> > even though not actually listed by the relevant standards.
> 
> Indeed, so a better patch would be to use
> 
> #if defined (_SC_NPROCESSORS_ONLN)
> 
> instead as you noted, so let's do that.

 This is what I have committed then, thank you for your review.

  Maciej
---
 gcc/ada/adaint.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Index: gcc/gcc/ada/adaint.c
===
--- gcc.orig/gcc/ada/adaint.c
+++ gcc/gcc/ada/adaint.c
@@ -2483,9 +2483,7 @@ __gnat_number_of_cpus (void)
 {
   int cores = 1;
 
-#if defined (__linux__) || defined (__sun__) || defined (_AIX) \
-  || defined (__APPLE__) || defined (__FreeBSD__) || defined (__OpenBSD__) \
-  || defined (__DragonFly__) || defined (__NetBSD__)
+#ifdef _SC_NPROCESSORS_ONLN
   cores = (int) sysconf (_SC_NPROCESSORS_ONLN);
 
 #elif defined (__QNX__)
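
As a stand-alone illustration of the feature test used in the hunk above,
the following sketch checks for the macro itself instead of listing
operating systems.  The function name number_of_cpus is hypothetical, and
the fallback value of 1 simply mirrors the initial value of 'cores' in the
hunk; the other branches of the real function are not reproduced.

```c
#include <unistd.h>

/* Return the number of online processors, or 1 when the system does
   not provide _SC_NPROCESSORS_ONLN or the query fails.  */
int number_of_cpus (void)
{
  int cores = 1;

#ifdef _SC_NPROCESSORS_ONLN
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  if (n > 0)
    cores = (int) n;
#endif

  return cores;
}
```

Testing the macro means a system that lacks it, such as the VAX/NetBSD
1.6.2 configuration mentioned above, compiles the fallback path instead of
failing.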

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Martin Sebor via Gcc-patches


On 11/20/20 1:37 PM, Jakub Jelinek wrote:

On Fri, Nov 20, 2020 at 01:28:03PM -0700, Martin Sebor via Gcc-patches wrote:

On 11/20/20 12:29 PM, Marek Polacek wrote:

On Fri, Nov 20, 2020 at 12:00:58PM -0700, Martin Sebor via Gcc-patches wrote:

To detect a subset of VLA misuses, the C front end associates the bounds
of VLAs in function argument lists with the corresponding variables
by implicitly adding an instance of attribute access to each function
declared to take VLAs with the bound expressions chained on the list
of attribute arguments.

Some of these expressions end up modified by the middle end, which
results in references to nonlocal variables (and perhaps other nodes)
used in these expressions getting garbage collected.  A simple example
of this is described in pr97172.

By unsharing the bound expressions the patch below prevents this from
happening (it's not a fix for pr97172).

My understanding of the details of node sharing and garbage collection
in GCC is very limited (I didn't expect a tree to be garbage-collected
if it's still referenced by something).  Is this the right approach
to solving this problem?


ISTM that a more natural thing would be to use build_distinct_type_copy
to copy the type you're about to modify.


The get_parm_array_spec function doesn't modify a type.  It's called
from push_parm_decl() to build an "arg spec" attribute with the VLA
bounds as arguments.  push_parm_decl() then adds the attribute to
the function's PARM_DECL by calling decl_attributes().  When all of
the function's parameters have been processed the "arg specs" are
then extracted and added as an attribute access specification with
the VLA bounds added to the function declaration.


Guess it isn't that the trees would be GC collected, that can't happen if
they are referenced from reachable trees, but the thing is that the
gimplifier is destructive, it overwrites various trees as it is gimplifying
function bodies.  That is why the function bodies are normally unshared, but
that unsharing doesn't really walk function attributes.
On the other side, for VLAs unsharing is quite harmful, e.g. if there is
   int vla[foo ()];
then if it is unshared (except when it is a SAVE_EXPR that wouldn't be
unshared), then it could call the foo () function multiple times.
For VLA bounds in PARM_DECLs we are hopefully more restricted than that,
if it involves only other PARM_DECLs and constants and expressions composed
of them, the unsharing could be fine.


VLA parameter bounds can involve any other expressions, including
function calls.  It's those rather than other parameters that also
trigger the problem (at least in the test cases I've seen).

When/how would the unsharing cause the expression to be evaluated
multiple times?  And if/when it did, would simply wrapping the whole
expression in a SAVE_EXPR be the right way to avoid it or would it
need to be more involved than that?

Martin

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 20, 2020 at 02:30:43PM -0700, Martin Sebor wrote:
> VLA parameter bounds can involve any other expressions, including
> function calls.  It's those rather than other parameters that also
> trigger the problem (at least in the test cases I've seen).
> 
> When/how would the unsharing cause the expression to be evaluated
> multiple times?  And if/when it did, would simply wrapping the whole
> expression in a SAVE_EXPR be the right way to avoid it or would it
> need to be more involved than that?

Well, unshare_expr just doesn't unshare SAVE_EXPRs, it only ensures that
the trees inside of them aren't shared with something else (aka unshares the
subtrees the first time it sees the SAVE_EXPR), but doesn't unshare the
SAVE_EXPR node itself and doesn't walk children the second and following
time.
So, the question is whether you are creating the attributes before the
SAVE_EXPRs are added to the bounds or after it, and whether when evaluating
the (unshared) expressions in there you always place it after something
initialized those SAVE_EXPRs first.
The SAVE_EXPRs are essential, so that the functions aren't called multiple
times.

Jakub
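
Seen from user code, the guarantee at stake is that a VLA bound is
evaluated once, when the declaration is reached.  The sketch below only
demonstrates that observable behavior; it does not show GCC internals,
and the names bound, bound_call_count and vla_elements are made up for
the example.  It assumes a C compiler with VLA support.

```c
#include <stddef.h>

static int bound_calls;   /* how many times the bound expression ran */

/* The function called from the VLA bound; it counts its own calls.  */
int bound (void)
{
  ++bound_calls;
  return 4;
}

int bound_call_count (void)
{
  return bound_calls;
}

/* Declare a VLA whose bound is a function call and return its element
   count.  Taking sizeof of the array reuses the size computed at the
   declaration; it does not call bound () again.  */
size_t vla_elements (void)
{
  int vla[bound ()];
  vla[0] = 0;
  return sizeof vla / sizeof vla[0];
}
```

If the bound expression were duplicated and evaluated at each use, the
counter would advance more than once per call to vla_elements, which is
the effect the SAVE_EXPRs are described as preventing.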

[committed] avoid invalid redeclarations (PR 97861)

2020-11-20 Thread Martin Sebor via Gcc-patches


When checking for mismatches in the array forms of arguments
between the current and the new declaration of a function,
warn_parm_array_mismatch() assumes that the new declaration is
valid and compatible with the current one.  When that's not so,
the function crashes with a null pointer dereference.  In r11-5213
I have committed the attached fix to avoid this unsafe assumption.

Martin

commit 27c5416fc8a4c2b33a0d6b6a26da2518791e0464
Author: Martin Sebor 
Date:   Fri Nov 20 14:35:25 2020 -0700

PR middle-end/97861 - ICE on an invalid redeclaration of a function with attribute access

gcc/c-family/ChangeLog:
* c-warn.c (warn_parm_array_mismatch): Bail on invalid redeclarations
with fewer arguments.

gcc/testsuite/ChangeLog:
* gcc.dg/attr-access-4.c: New test.

diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index 6d1f9a73e44..6d22a113ad0 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -3374,18 +3374,20 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
   for (tree curp = curparms, newp = newparms; curp;
curp = TREE_CHAIN (curp), newp = TREE_CHAIN (newp), ++parmpos)
 {
+  if (!newp)
+	/* Bail on invalid redeclarations with fewer arguments.  */
+	return;
+
   /* Only check pointers and C++ references.  */
   tree newptype = TREE_TYPE (newp);
   if (!POINTER_TYPE_P (newptype))
 	continue;
 
-  {
-	/* Skip mismatches in __builtin_va_list that is commonly
-	   an array but that in declarations of built-ins decays
-	   to a pointer.  */
-	if (builtin && TREE_TYPE (newptype) == TREE_TYPE (va_list_type_node))
-	  continue;
-  }
+  /* Skip mismatches in __builtin_va_list that is commonly
+	 an array but that in declarations of built-ins decays
+	 to a pointer.  */
+  if (builtin && TREE_TYPE (newptype) == TREE_TYPE (va_list_type_node))
+	continue;
 
   /* Access specs for the argument on the current (previous) and
 	 new (to replace the current) declarations.  Either may be null,
diff --git a/gcc/testsuite/gcc.dg/attr-access-4.c b/gcc/testsuite/gcc.dg/attr-access-4.c
new file mode 100644
index 000..7a2870a0ee4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/attr-access-4.c
@@ -0,0 +1,8 @@
+/* PR middle-end/97861 - ICE on an invalid redeclaration of a function
+   with attribute access
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+__attribute__ ((access (read_only, 2)))
+void f (int, int*);
+void f (int a) { }  // { dg-error "conflicting types for 'f'" }
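
The shape of the fix, walking two chains in lockstep and stopping when
the second one runs out, can be sketched without any GCC types.  The
struct and function below are hypothetical stand-ins for the parameter
chains and for warn_parm_array_mismatch; only the early exit mirrors the
patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a chained parameter declaration.  */
struct param
{
  int is_pointer;
  struct param *next;
};

/* Walk the current and the new parameter lists together and return how
   many pointer parameters of the new list were examined.  The new list
   may be shorter on an invalid redeclaration, so it is checked before
   it is dereferenced.  */
int count_checked_params (const struct param *cur, const struct param *newp)
{
  int checked = 0;

  for (; cur; cur = cur->next, newp = newp->next)
    {
      if (!newp)
        break;          /* fewer parameters than before: bail out */

      if (!newp->is_pointer)
        continue;       /* only pointers are of interest */

      ++checked;
    }

  return checked;
}
```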

[pushed] dwarf2: ICE with local class in unused function [PR97918]

2020-11-20 Thread Jason Merrill via Gcc-patches

Here, since we only mention bar, we never emit debug information for it.
But we do emit debug information for H::h, so we need to refer to the
debug info for bar::J even though there is no bar.  We deal with this
sort of thing in dwarf2out with the limbo_die_list; parentless dies like J
get attached to the CU at EOF.  But here, we were flushing the limbo list,
then generating the template argument DIE for H that refers to J, which
adds J to the limbo list, too late to be flushed.  So let's flush a little
later.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/ChangeLog:

PR c++/97918
* dwarf2out.c (dwarf2out_early_finish): flush_limbo_die_list
after gen_scheduled_generic_parms_dies.

gcc/testsuite/ChangeLog:

PR c++/97918
* g++.dg/debug/localclass2.C: New test.
---
 gcc/dwarf2out.c  |  6 +++---
 gcc/testsuite/g++.dg/debug/localclass2.C | 23 +++
 2 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/debug/localclass2.C

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index ea2a22a3042..07e1a921832 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -32169,13 +32169,13 @@ dwarf2out_early_finish (const char *filename)
  emit full debugging info for them.  */
   retry_incomplete_types ();
 
+  gen_scheduled_generic_parms_dies ();
+  gen_remaining_tmpl_value_param_die_attribute ();
+
   /* The point here is to flush out the limbo list so that it is empty
  and we don't need to stream it for LTO.  */
   flush_limbo_die_list ();
 
-  gen_scheduled_generic_parms_dies ();
-  gen_remaining_tmpl_value_param_die_attribute ();
-
   /* Add DW_AT_linkage_name for all deferred DIEs.  */
   for (limbo_die_node *node = deferred_asm_name; node; node = node->next)
 {
diff --git a/gcc/testsuite/g++.dg/debug/localclass2.C b/gcc/testsuite/g++.dg/debug/localclass2.C
new file mode 100644
index 000..3dd7cae322d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/localclass2.C
@@ -0,0 +1,23 @@
+// PR c++/97918
+// { dg-additional-options "-g -O -flto" }
+// { dg-do compile { target c++11 } }
+
+namespace { class A {}; }
+class B {};
+template  struct H {
+  constexpr static unsigned h = 0;
+};
+
+template  A bar ()
+{
+  struct J {
+static void foo();
+  };
+  H();
+  return A ();
+}
+
+void fn ()
+{
+  bar;  // only mentions the function
+}

base-commit: 27c5416fc8a4c2b33a0d6b6a26da2518791e0464
-- 
2.18.4

Re: [PATCH] Additional small changes to support opaque modes

2020-11-20 Thread Aaron Sawdey via Gcc-patches

> On Nov 20, 2020, at 4:57 AM, Aaron Sawdey via Gcc-patches wrote:
> 
> 
>> On Nov 20, 2020, at 3:55 AM, Richard Sandiford wrote:
>> 
>> acsawdey--- via Gcc-patches writes:
>>> @@ -16767,7 +16768,7 @@ loc_descriptor (rtx rtl, machine_mode mode,
>>>  break;
>>> 
>>>case CONST_INT:
>>> -  if (mode != VOIDmode && mode != BLKmode)
>>> +  if (mode != VOIDmode && mode != BLKmode && !OPAQUE_MODE_P (mode))
>>> {
>>>   int_mode = as_a  (mode);
>>>   loc_result = address_of_int_loc_descriptor (GET_MODE_SIZE (int_mode),
>> 
>> I realise I'm asking this about something that already appears to handle
>> BLKmode CONST_INTs (?!), but this is the one change in the patch I
>> struggled with.  Why do we see a CONST_INT that allegedly has an
>> opaque mode?  It feels like something has gone wrong further up the
>> call chain.
>> 
>> This might still be the expedient fix for whatever is happening,
>> but I think it deserves a comment at least.
>> 
>> The rest looks good to me FWIW.
>> 
>> Richard
> 
> I should look at this again: since I originally put that in, I switched the
> target portion of what I’ve been doing to use an UNSPEC to remove all use of
> an opaque mode const_int from the RTL. This may not be needed any more.

And as a final addendum — I was able to remove this and the problem I saw
before did not come back, probably because UNSPEC is used to hide all
constants so we never see any opaque type or mode constants, which is a
good thing.

Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Martin Sebor via Gcc-patches


On 11/20/20 2:41 PM, Jakub Jelinek wrote:

On Fri, Nov 20, 2020 at 02:30:43PM -0700, Martin Sebor wrote:

VLA parameter bounds can involve any other expressions, including
function calls.  It's those rather than other parameters that also
trigger the problem (at least in the test cases I've seen).

When/how would the unsharing cause the expression to be evaluated
multiple times?  And if/when it did, would simply wrapping the whole
expression in a SAVE_EXPR be the right way to avoid it or would it
need to be more involved than that?


Well, unshare_expr just doesn't unshare SAVE_EXPRs, it only ensures that
the trees inside of them aren't shared with something else (aka unshares the
subtrees the first time it sees the SAVE_EXPR), but doesn't unshare the
SAVE_EXPR node itself and doesn't walk children the second and following
time.
So, the question is whether you are creating the attributes before the
SAVE_EXPRs are added to the bounds or after it, and whether when evaluating
the (unshared) expressions in there you always place it after something
initialized those SAVE_EXPRs first.
The SAVE_EXPRs are essential, so that the functions aren't called multiple
times.


At the point the attribute is created there is no SAVE_EXPR.  So for
something like:

 int f (void);
 void g (int a[f () + 1]) { }

the bound is a PLUS_EXPR (CALL_EXPR (f), 1).

I don't do anything with the expression except put them on the chain
of arguments to the two attributes and print them in warnings.

Martin

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 20, 2020 at 02:54:34PM -0700, Martin Sebor wrote:
> At the point the attribute is created there is no SAVE_EXPR.  So for
> something like:
> 
>  int f (void);
>  void g (int a[f () + 1]) { }
> 
> the bound is a PLUS_EXPR (CALL_EXPR (f), 1).
> 
> I don't do anything with the expression except put them on the chain
> of arguments to the two attributes and print them in warnings.

So that likely means you are doing it too early.

Jakub

Re: [PATCH] c++: Add missing verify_type_context call [PR97904]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/20/20 2:27 PM, Richard Sandiford wrote:

When adding the verify_type_context target hook, I'd missed
a site that needs to check an array element type.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for master
and GCC 10 branch?


OK.


Thanks,
Richard


gcc/cp/
PR c++/97904
* pt.c (tsubst): Use verify_type_context to check the type
of an array element.

gcc/testsuite/
PR c++/97904
* g++.dg/ext/sve-sizeless-1.C: Add more template tests.
* g++.dg/ext/sve-sizeless-2.C: Likewise.
---
  gcc/cp/pt.c   |  4 +++
  gcc/testsuite/g++.dg/ext/sve-sizeless-1.C | 33 +--
  gcc/testsuite/g++.dg/ext/sve-sizeless-2.C | 33 +--
  3 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 463b1c3a57d..89fec98ad67 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15867,6 +15867,10 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl)
return error_mark_node;
  }
  
+	if (!verify_type_context (input_location, TCTX_ARRAY_ELEMENT, type,

+ !(complain & tf_error)))
+ return error_mark_node;
+
r = build_cplus_array_type (type, domain);
  
  	if (!valid_array_size_p (input_location, r, in_decl,

diff --git a/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C b/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
index 7f829220c71..9f05ca5a855 100644
--- a/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
+++ b/gcc/testsuite/g++.dg/ext/sve-sizeless-1.C
@@ -72,10 +72,37 @@ template class templated_struct4;
  template struct templated_struct5 : T {}; // { dg-error {base 
type '[^']*' fails to be a struct or class type} }
  template class templated_struct5;
  
+template struct templated_struct6 { T x[N]; }; // { dg-error {array elements cannot have SVE type '(__SVInt8_t|svint8_t)'} }

+template class templated_struct6;
+
+template
+struct templated_struct7 {
+  static const int size = sizeof (T); // { dg-error {SVE type 
'(__SVInt8_t|svint8_t)' does not have a fixed size} }
+#if __cplusplus >= 201103L
+  static const int align = alignof (T); // { dg-error {SVE type '(__SVInt8_t|svint8_t)' 
does not have a defined alignment} "" { target c++11 } }
+#endif
+
+  void f1 (T (&)[2]); // { dg-error {array elements cannot have SVE type 
'(__SVInt8_t|svint8_t)'} }
+#if __cplusplus >= 201103L
+  auto f2 () -> decltype (new T); // { dg-error {cannot allocate objects with SVE type 
'(__SVInt8_t|svint8_t)'} "" { target c++11 } }
+  auto f3 (T *a) -> decltype (delete a); // { dg-error {cannot delete objects with SVE 
type '(__SVInt8_t|svint8_t)'} "" { target c++11 } }
+#else
+  void f2 () throw (T); // { dg-error {cannot throw or catch SVE type 
'(__SVInt8_t|svint8_t)'} "" { target c++98_only } }
+#endif
+};
+template class templated_struct7;
+
+template struct templated_struct8 { typedef int type; };
+
+template
+void sfinae_f1 (typename templated_struct8::type);
+template
+void sfinae_f1 (T &);
+
  #if __cplusplus >= 201103L
  template using typedef_sizeless1 = svint8_t;
  template using typedef_sizeless1 = svint8_t;
-template using array = T[2];
+template using array = T[2]; // { dg-error {array elements cannot have SVE 
type '(svint8_t|__SVInt8_t)'} "" { target c++11 } }
  #endif
  
  // Pointers to sizeless types.

@@ -119,7 +146,7 @@ statements (int n)
__alignof (ext_produce_sve_sc ()); // { dg-error {SVE type 'svint8_t' does 
not have a defined alignment} }
  
  #if __cplusplus >= 201103L

-  array foo = {}; // { dg-error {array elements cannot have SVE type 
'(svint8_t|__SVInt8_t)'} "" { target c++11 } }
+  array foo = {}; // { dg-message {required from here} "" { target 
c++11 } }
  #endif
  
// Initialization.

@@ -298,6 +325,8 @@ statements (int n)
thrower2 ();
  #endif
  
+  sfinae_f1 (sve_sc1);

+
// Use in traits.  Doesn't use static_assert so that tests work with
// earlier -std=s.
  
diff --git a/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C b/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C

index 40b65d37f8a..0b86d9e8217 100644
--- a/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
+++ b/gcc/testsuite/g++.dg/ext/sve-sizeless-2.C
@@ -72,10 +72,37 @@ template class templated_struct4;
  template struct templated_struct5 : T {}; // { dg-error {base 
type '[^']*' fails to be a struct or class type} }
  template class templated_struct5;
  
+template struct templated_struct6 { T x[N]; }; // { dg-error {array elements cannot have SVE type '(__SVInt8_t|svint8_t)'} }

+template class templated_struct6;
+
+template
+struct templated_struct7 {
+  static const int size = sizeof (T); // { dg-error {SVE type 
'(__SVInt8_t|svint8_t)' does not have a fixed size} }
+#if __cplusplus >= 201103L
+  static const int align = alignof (T); // { dg-error {SVE type '(__SVInt8_t|svint8_t)' 
does not have a defined alignment} "" { target c++11 } }
+#endif
+
+  void f1 (T (&)[2]); // { dg-error {array elements cannot have SVE type 
'(__SVInt8_t

Re: [PATCH] RISC-V: Always define MULTILIB_DEFAULTS

2020-11-20 Thread Jim Wilson

On Fri, Nov 20, 2020 at 12:34 AM Kito Cheng  wrote:

>  - Define MULTILIB_DEFAULTS can reduce the total number of multilib if
>the default arch and ABI are listed in the multilib config.
>

It looks like a good idea, but it doesn't seem to work.  A toolchain
configured without specifying arch/abi gives me

rohan:2149$ riscv64-unknown-elf-gcc --print-multi-lib
.;
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32iac/ilp32;@march=rv32iac@mabi=ilp32
rv32imac/ilp32;@march=rv32imac@mabi=ilp32
rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f
rv64imac/lp64;@march=rv64imac@mabi=lp64
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d

The rv64imafdc/lp64d is actually built twice, as it is the default and is
built explicitly.  But otherwise the list is correct.

If I configure a toolchain with the patch using --with-arch=rv32i
--with-abi=ilp32 --enable-multilib then I get

rohan:2151$ ./xgcc -B./ --print-multi-lib
.;
rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f
rv64imac/lp64;@march=rv64imac@mabi=lp64
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d

and notice that 3 multilibs have mysteriously disappeared.  We have four
multilibs with -mabi=ilp32.  The rv32i/ilp32 multilib is gone as it is now
the default, but the other 3 should still be there.

The gcc/multilib.h file in the gcc build dir looks correct.  I think that
there is something wrong with the processing of the default args against
the multilib list.  If you have just one default arg, and an entry matches
then obviously you don't build it.  But if you have two default args, then
both must match before you choose to not build it, and I think the code has
never supported this case.

This works for the --with-multilib-list case because we currently only
support one arch/abi with this configure option, so there can be no
confusion with matching multiple default arguments.
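
The matching rule argued for above can be written down as a small
predicate.  This is a sketch of the intended rule only, not the driver's
actual code; the function names and the option-string layout are
assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if OPT appears in DEFAULTS, 0 otherwise.  */
static int in_defaults (const char *opt, const char *const *defaults,
                        size_t ndefaults)
{
  for (size_t i = 0; i < ndefaults; ++i)
    if (strcmp (opt, defaults[i]) == 0)
      return 1;
  return 0;
}

/* A multilib entry duplicates the default library only if every one of
   its options is a default.  Matching just one of them (for example
   mabi=ilp32 alone) is not enough to skip building it.  */
int multilib_is_default (const char *const *opts, size_t nopts,
                         const char *const *defaults, size_t ndefaults)
{
  for (size_t i = 0; i < nopts; ++i)
    if (!in_defaults (opts[i], defaults, ndefaults))
      return 0;
  return nopts > 0;
}
```

With the defaults march=rv32i and mabi=ilp32, this rule drops only the
rv32i/ilp32 entry and keeps rv32im/ilp32, rv32iac/ilp32 and
rv32imac/ilp32, which is the list expected above.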

Jim

Re: [PATCH] c++: Fix wrong error with constexpr destructor [PR97427]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/19/20 8:21 PM, Marek Polacek wrote:

When I implemented the code to detect modifying const objects in
constexpr contexts, we couldn't have constexpr destructors, so I didn't
consider them.  But now we can and that caused a bogus error in this
testcase: [class.dtor]p5 says that "const and volatile semantics are not
applied on an object under destruction.  They stop being in effect when
the destructor for the most derived object starts." so we have to clear
the TREE_READONLY flag we set on the object after the constructors have
been called to mark it as no-longer-under-construction.  In the ~Foo
call it's now an object under destruction, so don't report those errors.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?


OK.


gcc/cp/ChangeLog:

PR c++/97427
* constexpr.c (cxx_set_object_constness): New function.
(cxx_eval_call_expression): Set new_obj for destructors too.
Call cxx_set_object_constness to set/unset TREE_READONLY of
the object under construction/destruction.

gcc/testsuite/ChangeLog:

PR c++/97427
* g++.dg/cpp2a/constexpr-dtor10.C: New test.
---
  gcc/cp/constexpr.c| 49 +--
  gcc/testsuite/g++.dg/cpp2a/constexpr-dtor10.C | 16 ++
  2 files changed, 49 insertions(+), 16 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-dtor10.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 625410327b8..ef37b3043a5 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2187,6 +2187,27 @@ cxx_eval_thunk_call (const constexpr_ctx *ctx, tree t, tree thunk_fndecl,
   non_constant_p, overflow_p);
  }
  
+/* If OBJECT is of const class type, evaluate it to a CONSTRUCTOR and set

+   its TREE_READONLY flag according to READONLY_P.  Used for constexpr
+   'tors to detect modifying const objects in a constexpr context.  */
+
+static void
+cxx_set_object_constness (const constexpr_ctx *ctx, tree object,
+ bool readonly_p, bool *non_constant_p,
+ bool *overflow_p)
+{
+  if (CLASS_TYPE_P (TREE_TYPE (object))
+  && CP_TYPE_CONST_P (TREE_TYPE (object)))
+{
+  /* Subobjects might not be stored in ctx->global->values but we
+can get its CONSTRUCTOR by evaluating *this.  */
+  tree e = cxx_eval_constant_expression (ctx, object, /*lval*/false,
+non_constant_p, overflow_p);
+  if (TREE_CODE (e) == CONSTRUCTOR && !*non_constant_p)
+   TREE_READONLY (e) = readonly_p;
+}
+}
+
  /* Subroutine of cxx_eval_constant_expression.
 Evaluate the call expression tree T in the context of OLD_CALL expression
 evaluation.  */
@@ -2515,11 +2536,11 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
  
depth_ok = push_cx_call_context (t);
  
-  /* Remember the object we are constructing.  */

+  /* Remember the object we are constructing or destructing.  */
tree new_obj = NULL_TREE;
-  if (DECL_CONSTRUCTOR_P (fun))
+  if (DECL_CONSTRUCTOR_P (fun) || DECL_DESTRUCTOR_P (fun))
  {
-  /* In a constructor, it should be the first `this' argument.
+  /* In a cdtor, it should be the first `this' argument.
 At this point it has already been evaluated in the call
 to cxx_bind_parameters_in_call.  */
new_obj = TREE_VEC_ELT (new_call.bindings, 0);
@@ -2656,6 +2677,12 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
  unsigned save_heap_alloc_count = ctx->global->heap_vars.length ();
  unsigned save_heap_dealloc_count = ctx->global->heap_dealloc_count;
  
+	  /* If this is a constexpr destructor, the object's const and volatile

+semantics are no longer in effect; see [class.dtor]p5.  */
+ if (new_obj && DECL_DESTRUCTOR_P (fun))
+   cxx_set_object_constness (ctx, new_obj, /*readonly_p=*/false,
+ non_constant_p, overflow_p);
+
  tree jump_target = NULL_TREE;
  cxx_eval_constant_expression (&ctx_with_save_exprs, body,
lval, non_constant_p, overflow_p,
@@ -2686,19 +2713,9 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 the object is no longer under construction, and its possible
 'const' semantics now apply.  Make a note of this fact by
 marking the CONSTRUCTOR TREE_READONLY.  */
- if (new_obj
- && CLASS_TYPE_P (TREE_TYPE (new_obj))
- && CP_TYPE_CONST_P (TREE_TYPE (new_obj)))
-   {
- /* Subobjects might not be stored in ctx->global->values but we
-can get its CONSTRUCTOR by evaluating *this.  */
- tree e = cxx_eval_constant_expression (ctx, new_obj,
-/*lval*/false,
-non_constant_p,
-

Re: [PATCH] c++: Allow template lambdas without lambda-declarator [PR97839]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/17/20 1:05 PM, Marek Polacek wrote:

Our implementation of template lambdas incorrectly requires the optional
lambda-declarator.  This was probably required by an early draft of
generic lambdas, but now the production is [expr.prim.lambda.general]:

  lambda-expression:
 lambda-introducer lambda-declarator [opt] compound-statement
 lambda-introducer < template-parameter-list > requires-clause [opt]
  lambda-declarator [opt] compound-statement

Therefore, we should accept the following test.

Incidentally, I noticed we give a terrible diagnostic when the user uses
'mutable', but forgets to type '()' before it, which sounds like a common
mistake.  So it seems to me we should handle that specifically, rather
than to emit this:

lambda-generic8.C: In lambda function:
lambda-generic8.C:8:18: error: expected '{' before 'mutable'
 8 |   [] mutable {}.operator()();
   |  ^~~
lambda-generic8.C: In function 'int main()':
lambda-generic8.C:8:17: error: expected ';' before 'mutable'
 8 |   [] mutable {}.operator()();
   | ^~~~
   | ;
lambda-generic8.C:8:28: error: expected primary-expression before '.' token
 8 |   [] mutable {}.operator()();
   |^
lambda-generic8.C:8:40: error: expected primary-expression before 'int'
 8 |   [] mutable {}.operator()();
   |^~~

Is it okay to fix this in stage3?


Yes: this is a bugfix, not new functionality.


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


gcc/cp/ChangeLog:

PR c++/97839
* parser.c (cp_parser_lambda_declarator_opt): Don't require ().

gcc/testsuite/ChangeLog:

PR c++/97839
* g++.dg/cpp2a/lambda-generic8.C: New test.
---
  gcc/cp/parser.c  | 14 ++
  gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C |  9 +
  2 files changed, 15 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..9f09c778c29 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10604,6 +10604,8 @@ cp_parser_trait_expr (cp_parser* parser, enum rid 
keyword)
  
 lambda-expression:

   lambda-introducer lambda-declarator [opt] compound-statement
+ lambda-introducer < template-parameter-list > requires-clause [opt]
+   lambda-declarator [opt] compound-statement
  
 Returns a representation of the expression.  */
  
@@ -11061,13 +11063,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)

  /* Parse the (optional) middle of a lambda expression.
  
 lambda-declarator:

- < template-parameter-list [opt] >
-   requires-clause [opt]
- ( parameter-declaration-clause [opt] )
-   attribute-specifier [opt]
+ ( parameter-declaration-clause )
 decl-specifier-seq [opt]
-   exception-specification [opt]
-   lambda-return-type-clause [opt]
+   noexcept-specifier [opt]
+   attribute-specifier-seq [opt]
+   trailing-return-type [opt]
 requires-clause [opt]
  
 LAMBDA_EXPR is the current representation of the lambda expression.  */

@@ -11217,8 +11217,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
tree lambda_expr)
   trailing-return-type in case of decltype.  */
pop_bindings_and_leave_scope ();
  }
-  else if (template_param_list != NULL_TREE) // generate diagnostic
-cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN);
  
/* Create the function call operator.
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C

new file mode 100644
index 000..f3c3809b36d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
@@ -0,0 +1,9 @@
+// PR c++/97839
+// { dg-do compile { target c++20 } }
+// Test that a lambda with  doesn't require
+// a lambda-declarator.
+
+int main()
+{
+  []{}.operator()();
+}

base-commit: 8661f4faa875f361cd22a197774c1fa04cd0580b

Re: [PATCH] c++: Allow template lambdas without lambda-declarator [PR97839]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/18/20 1:16 PM, Marek Polacek wrote:

On Tue, Nov 17, 2020 at 01:05:20PM -0500, Marek Polacek via Gcc-patches wrote:

Our implementation of template lambdas incorrectly requires the optional
lambda-declarator.  This was probably required by an early draft of
generic lambdas, but now the production is [expr.prim.lambda.general]:

  lambda-expression:
 lambda-introducer lambda-declarator [opt] compound-statement
 lambda-introducer < template-parameter-list > requires-clause [opt]
  lambda-declarator [opt] compound-statement

Therefore, we should accept the following test.

Incidentally, I noticed we give a terrible diagnostic when the user uses
'mutable', but forgets to type '()' before it, which sounds like a common
mistake.  So it seems to me we should handle that specifically, rather
than to emit this:


This might be necessary to handle  anyway.


Agreed.


lambda-generic8.C: In lambda function:
lambda-generic8.C:8:18: error: expected '{' before 'mutable'
 8 |   [] mutable {}.operator()();
   |  ^~~
lambda-generic8.C: In function 'int main()':
lambda-generic8.C:8:17: error: expected ';' before 'mutable'
 8 |   [] mutable {}.operator()();
   | ^~~~
   | ;
lambda-generic8.C:8:28: error: expected primary-expression before '.' token
 8 |   [] mutable {}.operator()();
   |^
lambda-generic8.C:8:40: error: expected primary-expression before 'int'
 8 |   [] mutable {}.operator()();
   |^~~

Is it okay to fix this in stage3?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97839
* parser.c (cp_parser_lambda_declarator_opt): Don't require ().

gcc/testsuite/ChangeLog:

PR c++/97839
* g++.dg/cpp2a/lambda-generic8.C: New test.
---
  gcc/cp/parser.c  | 14 ++
  gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C |  9 +
  2 files changed, 15 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..9f09c778c29 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10604,6 +10604,8 @@ cp_parser_trait_expr (cp_parser* parser, enum rid 
keyword)
  
 lambda-expression:

   lambda-introducer lambda-declarator [opt] compound-statement
+ lambda-introducer < template-parameter-list > requires-clause [opt]
+   lambda-declarator [opt] compound-statement
  
 Returns a representation of the expression.  */
  
@@ -11061,13 +11063,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)

  /* Parse the (optional) middle of a lambda expression.
  
 lambda-declarator:

- < template-parameter-list [opt] >
-   requires-clause [opt]
- ( parameter-declaration-clause [opt] )
-   attribute-specifier [opt]
+ ( parameter-declaration-clause )
 decl-specifier-seq [opt]
-   exception-specification [opt]
-   lambda-return-type-clause [opt]
+   noexcept-specifier [opt]
+   attribute-specifier-seq [opt]
+   trailing-return-type [opt]
 requires-clause [opt]
  
 LAMBDA_EXPR is the current representation of the lambda expression.  */

@@ -11217,8 +11217,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
tree lambda_expr)
   trailing-return-type in case of decltype.  */
pop_bindings_and_leave_scope ();
  }
-  else if (template_param_list != NULL_TREE) // generate diagnostic
-cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN);
  
/* Create the function call operator.
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C

new file mode 100644
index 000..f3c3809b36d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
@@ -0,0 +1,9 @@
+// PR c++/97839
+// { dg-do compile { target c++20 } }
+// Test that a lambda with  doesn't require
+// a lambda-declarator.
+
+int main()
+{
+  []{}.operator()();
+}

base-commit: 8661f4faa875f361cd22a197774c1fa04cd0580b
--
2.28.0



Marek

Re: [PATCH v2] c++: Extend -Wrange-loop-construct for binding-to-temp [PR94695]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/16/20 6:52 PM, Marek Polacek wrote:

On Mon, Nov 16, 2020 at 05:02:14PM -0500, Jason Merrill via Gcc-patches wrote:

On 11/15/20 10:34 PM, Marek Polacek wrote:

[ This year's end-of-stage1 I'm working virtually from American Samoa. ]

This patch finishes the second half of -Wrange-loop-construct I promised
to implement: it warns when a loop variable in a range-based for-loop is
initialized with a value of a different type resulting in a copy.  For
instance:

int arr[10];
for (const double &x : arr) { ... }

where in every iteration we have to create and destroy a temporary value
of type double, to which we bind the reference.  This could negatively
impact performance.

As per Clang, this doesn't warn when the range returns a copy, hence the
glvalue_p check.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/ChangeLog:

PR c++/94695
* doc/invoke.texi: Update the -Wrange-loop-construct description.

gcc/cp/ChangeLog:

PR c++/94695
* parser.c (warn_for_range_copy): Warn when the loop variable is
initialized with a value of a different type resulting in a copy.

gcc/testsuite/ChangeLog:

PR c++/94695
* g++.dg/warn/Wrange-loop-construct2.C: New test.
---
   gcc/cp/parser.c   |  30 ++-
   gcc/doc/invoke.texi   |  18 +-
   .../g++.dg/warn/Wrange-loop-construct2.C  | 212 ++
   3 files changed, 256 insertions(+), 4 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/warn/Wrange-loop-construct2.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..e96d0d94c76 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -12681,8 +12681,15 @@ do_range_for_auto_deduction (tree decl, tree 
range_expr)
for (const auto &x : range)
-   if this version doesn't make a copy.  DECL is the RANGE_DECL; EXPR is the
-   *__for_begin expression.
+   if this version doesn't make a copy.
+
+  This function also warns when the loop variable is initialized with
+  a value of a different type resulting in a copy:
+
+ int arr[10];
+ for (const double &x : arr)
+
+   DECL is the RANGE_DECL; EXPR is the *__for_begin expression.
  This function is never called when processing_template_decl is on.  */
   static void
@@ -12700,7 +12707,24 @@ warn_for_range_copy (tree decl, tree expr)
 if (TYPE_REF_P (type))
   {
-  /* TODO: Implement reference warnings.  */
+  if (!reference_compatible_p (non_reference (type), TREE_TYPE (expr))
+ && !reference_related_p (non_reference (type), TREE_TYPE (expr))


Is there a reason not to use ref_conv_binds_directly_p for this case as
well?


No reason.  reference_compatible_p also builds up a conversion, so it's not
cheaper than ref_conv_binds_directly_p as I mistakenly assumed.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch finishes the second half of -Wrange-loop-construct I promised
to implement: it warns when a loop variable in a range-based for-loop is
initialized with a value of a different type resulting in a copy.  For
instance:

   int arr[10];
   for (const double &x : arr) { ... }

where in every iteration we have to create and destroy a temporary value
of type double, to which we bind the reference.  This could negatively
impact performance.

As per Clang, this doesn't warn when the range returns a copy, hence the
glvalue_p check.
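
To make the binding-to-temporary case concrete, here is a minimal sketch. The helper names are hypothetical and not part of the patch. It compares the address of the loop variable with the address of the array element: with `const int &` the reference binds directly, while with `const double &` every iteration binds to a freshly converted temporary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only; they are not part of the patch.

// Counts the iterations in which the loop variable refers to the array
// element itself.  The types match, so the reference binds directly.
int count_direct_bindings_int (int (&arr)[3])
{
  int n = 0;
  std::size_t i = 0;
  for (const int &x : arr)
    {
      if (static_cast<const void *> (&x) == static_cast<const void *> (&arr[i]))
        ++n;
      ++i;
    }
  return n;
}

// Same loop with a mismatched element type: each int is converted to a
// temporary double and x binds to that temporary, never to arr[i].
int count_direct_bindings_double (int (&arr)[3])
{
  int n = 0;
  std::size_t i = 0;
  for (const double &x : arr)
    {
      if (static_cast<const void *> (&x) == static_cast<const void *> (&arr[i]))
        ++n;
      ++i;
    }
  return n;
}

// Exercises both helpers on a small array.
void check_bindings ()
{
  int arr[3] = {1, 2, 3};
  assert (count_direct_bindings_int (arr) == 3);
  assert (count_direct_bindings_double (arr) == 0);
}
```

The second loop is the pattern the extended warning is meant to catch. Changing the loop variable to `const int &` or to `double` by value avoids creating a temporary that the reference must bind to.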

gcc/ChangeLog:

PR c++/94695
* doc/invoke.texi: Update the -Wrange-loop-construct description.

gcc/cp/ChangeLog:

PR c++/94695
* parser.c (warn_for_range_copy): Warn when the loop variable is
initialized with a value of a different type resulting in a copy.

gcc/testsuite/ChangeLog:

PR c++/94695
* g++.dg/warn/Wrange-loop-construct2.C: New test.
---
  gcc/cp/parser.c   |  28 ++-
  gcc/doc/invoke.texi   |  18 +-
  .../g++.dg/warn/Wrange-loop-construct2.C  | 212 ++
  3 files changed, 254 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wrange-loop-construct2.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..74584501d11 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -12681,8 +12681,15 @@ do_range_for_auto_deduction (tree decl, tree 
range_expr)
  
   for (const auto &x : range)
  
-   if this version doesn't make a copy.  DECL is the RANGE_DECL; EXPR is the

-   *__for_begin expression.
+   if this version doesn't make a copy.
+
+  This function also warns when the loop variable is initialized with
+  a value of a different type resulting in a copy:
+
+ int arr[10];
+ for (const double &x : arr)
+
+   DECL is the RANGE_DECL; EXPR is the *__for_begin expression.
 This function is never called when processing_template_decl is on.  */
  
  static void

@@ -12700,7 +12707,22 @@ warn_for_range_copy (tree decl, tree expr)
  
if (TYPE_

Re: [PATCH] c++: Reject identifier label in constexpr [PR97846]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/16/20 9:58 PM, Marek Polacek wrote:

[dcl.constexpr]/3 says that the function-body of a constexpr function
shall not contain an identifier label, but we aren't enforcing that.

This patch implements that.  Of course, we can't reject artificial
labels.
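
As a rough illustration of why artificial labels must stay accepted: a constexpr function that contains only ordinary control flow has no identifier label in its source, even if the compiler introduces labels internally when lowering the loop (an assumption about the implementation, not something this message states). The function below is a hypothetical example, not taken from the patch or its testsuite.

```cpp
// Hypothetical example for illustration; not part of the patch.
// There is no user-written label here, so the function remains a valid
// constexpr function in C++14 and later.
constexpr int sum_to (int n)
{
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}

// Evaluated at compile time: 1 + 2 + 3 + 4.
static_assert (sum_to (4) == 10, "sum_to is usable in a constant expression");
```

Adding a label such as `x:` inside the body is what the new check is meant to diagnose.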

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97846
* constexpr.c (potential_constant_expression_1): Reject
LABEL_EXPRs that use non-artificial LABEL_DECLs.

gcc/testsuite/ChangeLog:

PR c++/97846
* g++.dg/cpp1y/constexpr-label.C: New test.
---
  gcc/cp/constexpr.c   | 9 -
  gcc/testsuite/g++.dg/cpp1y/constexpr-label.C | 9 +
  2 files changed, 17 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-label.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index e6ab5eecd68..e4fbce14065 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -7484,7 +7484,6 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
  case OVERLOAD:
  case TEMPLATE_ID_EXPR:
  case LABEL_DECL:
-case LABEL_EXPR:
  case CASE_LABEL_EXPR:
  case PREDICT_EXPR:
  case CONST_DECL:
@@ -8393,6 +8392,14 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
return false;
}
  
+case LABEL_EXPR:

+  t = LABEL_EXPR_LABEL (t);
+  if (DECL_ARTIFICIAL (t) && DECL_IGNORED_P (t))


Is it useful to check DECL_IGNORED_P?  I'd think we want to allow any 
artificial labels, regardless of whether we're emitting debug info for 
them.  OK either way.



+   return true;
+  else if (flags & tf_error)
+   error_at (loc, "label definition is not a constant expression");
+  return false;
+
  case ANNOTATE_EXPR:
return RECUR (TREE_OPERAND (t, 0), rval);
  
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-label.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-label.C

new file mode 100644
index 000..a2d113c186f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-label.C
@@ -0,0 +1,9 @@
+// PR c++/97846
+// { dg-do compile { target c++14 } }
+
+constexpr int
+f ()
+{
+x: // { dg-error "label definition is not a constant expression" }
+  return 42;
+}

base-commit: 814e016318646d06b1662219cc716d502b76d8ce

Re: [PATCH] c++: Fix ICE-on-invalid with -Wvexing-parse [PR97881]

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/17/20 2:32 PM, Marek Polacek wrote:

This invalid (?) code broke my assumption that if decl_specifiers->type
is null, there must be any type-specifiers.  Turn the assert into an if
to fix this crash.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97881
* parser.c (warn_about_ambiguous_parse): Only assume "int" if we
actually saw any type-specifiers.

gcc/testsuite/ChangeLog:

PR c++/97881
* g++.dg/warn/Wvexing-parse9.C: New test.
---
  gcc/cp/parser.c| 11 +--
  gcc/testsuite/g++.dg/warn/Wvexing-parse9.C |  8 
  2 files changed, 13 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wvexing-parse9.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b7ef259b048..7a6bf4ad2cf 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -20792,13 +20792,12 @@ warn_about_ambiguous_parse (const 
cp_decl_specifier_seq *decl_specifiers,
if (same_type_p (type, void_type_node))
return;
  }
+  else if (decl_specifiers->any_type_specifiers_p)
+/* Code like long f(); will have null ->type.  If we have any
+   type-specifiers, pretend we've seen int.  */
+type = integer_type_node;
else
-{
-  /* Code like long f(); will have null ->type.  If we have any
-type-specifiers, pretend we've seen int.  */
-  gcc_checking_assert (decl_specifiers->any_type_specifiers_p);
-  type = integer_type_node;
-}
+return;
  
auto_diagnostic_group d;

location_t loc = declarator->u.function.parens_loc;
diff --git a/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C 
b/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C
new file mode 100644
index 000..dc4198d6c5e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C
@@ -0,0 +1,8 @@
+// PR c++/97881
+// { dg-do compile }
+
+void
+cb ()
+{
+  volatile _Atomic (int) a1; // { dg-error "expected initializer" }


I'm not sure it's useful to test for this particular error, since a 
missing initializer isn't the problem with this declaration.  OK either way.



+}

base-commit: a5f9c27bfc4417224e332392bb81a2d733b2b5bf

Re: [PATCH] Objective-C++ : Allow prefix attrs on linkage specs.

2020-11-20 Thread Jason Merrill via Gcc-patches


On 11/7/20 10:11 AM, Iain Sandoe wrote:

Hi,

For Objective-C++/C, we cater for the possibility that a class interface
(@interface) might be preceded by prefix attributes.  In the case of
Objective-C++, the reference implementation (a.k.a. clang) also allows
(and combines) prefix attributes that precede a linkage specification
(but only on a single decl).

Some discussion with Nathan here:
https://gcc.gnu.org/pipermail/gcc/2020-October/234057.html

The upshot is that clang’s behaviour is inconsistent (I can file a bug,
I guess) - but since what is “well-formed” for Objective-C is defined in
reality by what clang accepts - there is a body of code out there that
depends on the behaviour (some variant of Hyrum’s law, or corollary
to it, perhaps?).

Inability to parse code including these patterns is blocking progress
in modernising GNU Objective-C, so I need to find a way forward.

The compromise made here is to accept the sequence when parsing
for Objective-C++, and to warn** that the attributes are discarded otherwise.

This seems to me to be an improvement in diagnostics for regular C++
(since it now says something pertinent to the actual problem and does
the 'same as usual' when encountering an unhandled attribute).

Tested across the Darwin patch, and on x86_64-linux-gnu,
OK for master?
thanks
Iain

** trivially, that could be an error instead - but it seems we usually warn
for unrecognised attributes.

—— commit message

For Objective-C++, this combines prefix attributes from before and
after top level linkage specs.  The "reference implementation" for
Objective-C++ allows this, and system headers depend on it.

e.g.

__attribute__((__deprecated__))
extern "C" __attribute__((__visibility__("default")))
@interface MyClass
...
@end

Would consider the list of prefix attributes to the interface for
MyClass to include both the visibility and deprecated ones.

When we are compiling regular C++, this emits a warning and discards
any prefix attributes before a linkage spec.

gcc/cp/ChangeLog:

* parser.c (cp_parser_declaration): Unless we are compiling for
Objective-C++, warn about and discard any attributes that prefix
a linkage specification.
---
  gcc/cp/parser.c | 71 +++--
  1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c4c672efa09..320d151c060 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2187,7 +2187,7 @@ static void cp_parser_already_scoped_statement
  static void cp_parser_declaration_seq_opt
(cp_parser *);
  static void cp_parser_declaration
-  (cp_parser *);
+  (cp_parser *, tree);
  static void cp_parser_toplevel_declaration
(cp_parser *);
  static void cp_parser_block_declaration
@@ -2238,7 +2238,7 @@ static tree cp_parser_alias_declaration
  static void cp_parser_asm_definition
(cp_parser *);
  static void cp_parser_linkage_specification
-  (cp_parser *);
+  (cp_parser *, tree);
  static void cp_parser_static_assert
(cp_parser *, bool);
  static tree cp_parser_decltype
@@ -13496,7 +13496,7 @@ cp_parser_declaration_seq_opt (cp_parser* parser)
__extension__ declaration */
  
  static void

-cp_parser_declaration (cp_parser* parser)
+cp_parser_declaration (cp_parser* parser, tree prefix_attrs)
  {
int saved_pedantic;
  
@@ -13504,7 +13504,7 @@ cp_parser_declaration (cp_parser* parser)

if (cp_parser_extension_opt (parser, &saved_pedantic))
  {
/* Parse the qualified declaration.  */
-  cp_parser_declaration (parser);
+  cp_parser_declaration (parser, prefix_attrs);
/* Restore the PEDANTIC flag.  */
pedantic = saved_pedantic;
  
@@ -13521,11 +13521,50 @@ cp_parser_declaration (cp_parser* parser)
  
tree attributes = NULL_TREE;
  
+  /* Conditionally, allow attributes to precede a linkage specification.  */

+  if (token1->keyword == RID_ATTRIBUTE)
+{
+  cp_lexer_save_tokens (parser->lexer);
+  attributes = cp_parser_attributes_opt (parser);
+  gcc_checking_assert (attributes);
+  cp_token *t1 = cp_lexer_peek_token (parser->lexer);
+  cp_token *t2 = (t1->type == CPP_EOF
+ ? t1 : cp_lexer_peek_nth_token (parser->lexer, 2));
+  if (t1->keyword == RID_EXTERN
+ && cp_parser_is_pure_string_literal (t2))
+   {
+ cp_lexer_commit_tokens (parser->lexer);
+ /* We might have already been here.  */
+ if (!c_dialect_objc ())
+   {
+ warning_at (token1->location, OPT_Wattributes, "attributes are"
+ " only permitted in this position for Objective-C++,"
+ " ignored");


It would be nice for the warning to suggest where to move the attribute, 
rather than just say that this location is bad.



+ attributes = NULL_TREE;
+   }
+ token1 = t1;
+ token2 = t2;
+   }
+  else
+   {
+ cp_lexer_rollback_tokens (parser->l

Re: [PATCH] unshare expressions in attribute arguments

2020-11-20 Thread Martin Sebor via Gcc-patches


On 11/20/20 2:57 PM, Jakub Jelinek wrote:

On Fri, Nov 20, 2020 at 02:54:34PM -0700, Martin Sebor wrote:

At the point the attribute is created there is no SAVE_EXPR.  So for
something like:

  int f (void);
  void g (int a[f () + 1]) { }

the bound is a PLUS_EXPR (CALL_EXPR (f), 1).

I don't do anything with the expression except put them on the chain
of arguments to the two attributes and print them in warnings.


So that likely means you are doing it too early.


The bounds are added to attribute "arg spec" for each param in
push_parm_decl.  I think that's both as early and (except maybe
in function definitions) as late as can be.  After that point,
the association between a VLA parameter and its most significant
bound is lost.

For example, in:

  void f (int n, int A[n], int B[foo () + 1]);

A and B become pointers in push_parm_decl() (in grokdeclarator()
called from it) and there's no way that I know to retrieve
the bounds at a later point.  AFAICT, they're gone unless
the function is being defined.  Is there another/better point
to extract this association that escapes me?

The VLA bounds are evaluated in function definitions so there
must be a point where that's done.  I don't know where that
happens but unless at that point the most significant bound
is still associated with the param (AFAIK, it never really
is at the tree level) it wouldn't help me.

Martin

Re: [PATCH] libstdc++: Add C++2a synchronization support

2020-11-20 Thread Thomas Rodgers

Tested x86_64-pc-linux-gnu, committed.

> On Oct 27, 2020, at 3:23 AM, Jonathan Wakely  wrote:
> 
> On 26/10/20 14:48 -0700, Thomas Rodgers wrote:
>> +#include 
>> +
>> +#if __has_include(<semaphore.h>)
>> +#define _GLIBCXX_HAVE_POSIX_SEMAPHORE 1
>> +#include <semaphore.h>
> 
> It occurs to me now that this check probably isn't robust enough. For
> any POSIX system it's probably safe to assume that <semaphore.h> means
> the POSIX header and so sem_t is available.
> 
> But on non-POSIX systems there could be some other, unrelated header
> called <semaphore.h> in the include paths that the user is compiling
> this header with. It's not inconceivable that the user's own project
> or some third party lib could provide a file called semaphore.h, which
> wouldn't define sem_t, sem_init etc.
> 
> It's OK for now, but we should revisit this and add an autoconf check
> for sem_init etc. to check at build time whether we've got POSIX
> semaphores available or not.
> 
> Please add a "FIXME: replace this with an autoconf check" comment
> here.
> 
> OK for trunk with that change, thanks.
>
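
For readers following the robustness concern, one possible stopgap is sketched below. It is not the autoconf check requested above and not what the patch does. It only includes the semaphore header when `<unistd.h>` advertises POSIX semaphores through `_POSIX_SEMAPHORES`. The macro `EXAMPLE_HAVE_POSIX_SEMAPHORE` and the function `have_posix_semaphore` are hypothetical names used only for illustration.

```cpp
// Hypothetical sketch; the names below are illustrative and not libstdc++'s.

// Pull in <unistd.h> only if the compiler can tell us that it exists.
#if defined(__has_include)
# if __has_include(<unistd.h>)
#  include <unistd.h>   // defines _POSIX_SEMAPHORES on POSIX systems
# endif
#endif

// Trust <semaphore.h> only when the system advertises POSIX semaphores,
// instead of relying on the mere presence of a header with that name.
#if defined(_POSIX_SEMAPHORES) && (_POSIX_SEMAPHORES + 0) > 0
# include <semaphore.h>
# define EXAMPLE_HAVE_POSIX_SEMAPHORE 1
#else
# define EXAMPLE_HAVE_POSIX_SEMAPHORE 0
#endif

// Reports the outcome of the preprocessor check above.
constexpr bool have_posix_semaphore ()
{
  return EXAMPLE_HAVE_POSIX_SEMAPHORE != 0;
}
```

A build-time configure test that actually compiles and links a call to sem_init remains the more reliable option, since it does not depend on what the headers merely claim.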
