Re: [ADA PATCH] Fix locales.c iso_3166 bug

2019-10-27 Thread Arnaud Charlet
> My script to check for PR79475-like issues discovered what looks like a bug
> in the Ada FE.  I actually have no idea how it can work properly for any
> entries after the last US entry or "United-States", because it will only
> match United-StatesUY to US or UZ to Uruguay etc., rather than
> United-States to US, Uruguay to UY, Uzbekistan to UZ etc.
> 
> Ok for trunk if it passes bootstrap/regtested on x86_64-linux?
> 
> 2019-10-24  Jakub Jelinek  
> 
>* locales.c (iso_3166): Add missing comma after "United-States".

Ok, thanks


Re: [PATCH] add __has_builtin (PR 66970)

2019-10-27 Thread Jeff Law
On 10/1/19 11:16 AM, Martin Sebor wrote:
> Attached is an implementation of the __has_builtin special
> preprocessor operator/macro analogous to __has_attribute and
> (hopefully) compatible with the synonymous Clang feature (I
> couldn't actually find tests for it in the Clang test suite
> but if someone points me at them I'll verify it).
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> PS I couldn't find an existing API to test whether a reserved
> symbol like __builtin_offsetof is a function-like built-in so
> I hardwired the tests for C and C++ into the new names_builtin_p
> functions.  I don't like this very much because the next time
> such an operator is added there is nothing to remind us to update
> the functions.  Adding a flag to the c_common_reswords array would
> solve the problem but at the expense of a linear search through
> it.  Does anyone have a suggestion for how to do this better?
> 
> gcc-66970.diff
> 
> PR c/66970 - Add __has_builtin() macro
> 
> gcc/ChangeLog:
>   PR c/66970
>   * doc/cpp.texi (__has_builtin): Document.
>   * doc/extend.texi (__builtin_frob_return_addr): Correct spelling.
> 
> gcc/c/ChangeLog:
> 
>   PR c/66970
>   * c-decl.c (names_builtin_p): Define a new function.
> 
> gcc/c-family/ChangeLog:
> 
>   PR c/66970
>   * c-common.c (c_common_nodes_and_builtins): Call c_define_builtins
>   even when only preprocessing.
>   * c-common.h (names_builtin_p): Declare new function.
>   * c-lex.c (init_c_lex): Set has_builtin.
>   (c_common_has_builtin): Define a new function.
>   * c-ppoutput.c (init_pp_output): Set has_builtin.
> 
> gcc/cp/ChangeLog:
> 
>   PR c/66970
>   * cp-objcp-common.c (names_builtin_p): Define new function.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/66970
>   * c-c++-common/cpp/has-builtin-2.c: New test.
>   * c-c++-common/cpp/has-builtin-3.c: New test.
>   * c-c++-common/cpp/has-builtin.c: New test.
> 
> libcpp/ChangeLog:
> 
>   PR c/66970
>   * include/cpplib.h (cpp_builtin_type): Add BT_HAS_BUILTIN.
>   (cpp_callbacks::has_builtin): Declare new member.
>   * init.c (builtin_array): Add an element for BT_HAS_BUILTIN.
>   (cpp_init_special_builtins): Handle BT_HAS_BUILTIN.
>   * macro.c (_cpp_builtin_macro_text): Same.
>   * traditional.c: Same.
OK
jeff



Re: [WIP PATCH] add object access attributes (PR 83859)

2019-10-27 Thread Jeff Law
On 9/29/19 1:51 PM, Martin Sebor wrote:
> -Wstringop-overflow detects a subset of past-the-end read and write
> accesses by built-in functions such as memcpy and strcpy.  It relies
> on knowledge of the functions' effects, which is hardwired into
> GCC.  Although it's possible for users to create wrappers for their
> own functions to detect similar problems, it's quite cumbersome and
> so only lightly used outside system libraries like Glibc.  Even Glibc
> only checks for buffer overflow and not for reading past the end.
> 
> PR 83859 asks to expose the same checking that GCC does natively for
> built-in calls via a function attribute that associates a pointer
> argument with the size argument, such as:
> 
>   __attribute__((buffer_size (1, 2))) void
>   f (char* dst, size_t dstsize);
> 
> The attached patch is my initial stab at providing this feature by
> introducing three new attributes:
> 
>   * read_only (ptr-argno, size-argno)
>   * write_only (ptr-argno, size-argno)
>   * read_write (ptr-argno, size-argno)
> 
> As requested, the attributes associate a pointer parameter to
> a function with a size parameter.  In addition, they also specify
> how the function accesses the object the pointer points to: either
> it only reads from it, or it only writes to it, or it does both.
> 
> Besides enabling the same buffer overflow detection as for built-in
> string functions they also let GCC issue -Wuninitialized warnings
> for uninitialized objects passed to read-only functions by reference,
> and -Wunused-but-set warnings for objects passed to write-only
> functions that are otherwise unused (PR 80806).  The -Wuninitialized
> part is done. The -Wunused-but-set detection is implemented only in
> the C FE and not yet in C++.
> 
> Besides the diagnostic improvements above the attributes also open
> up optimization opportunities such as DCE.  I'm still working on this
> and so it's not yet part of the initial patch.
> 
> I plan to finish the patch for GCC 10 but I don't expect to have
> the time to start taking advantage of the attributes for optimization
> until GCC 11.
> 
> Besides regression testing on x86_64-linux, I also tested the patch
> by compiling Binutils/GDB, Glibc, and the Linux kernel with it.  It
> found no new problems but caused a handful of -Wunused-but-set-variable
> false positives due to an outstanding bug in the C front-end introduced
> by the patch that I still need to fix.
> 
> Martin
> 
> gcc-80806.diff
> 
> PR c/80806 - gcc does not warn if local array is memset only
> PR middle-end/83859 - attribute to associate buffer and its size
> 
> gcc/ChangeLog:
> 
>   PR c/80806
>   PR middle-end/83859
>   * builtin-attrs.def (ATTR_NO_SIDE_EFFECT): New.
>   (ATTR_READ_ONLY, ATTR_READ_WRITE, ATTR_WRITE_ONLY): New.
>   (ATTR_NOTHROW_WRONLY1_LEAF, ATTR_NOTHROW_WRONLY1_2_LEAF): New.
>   (ATTR_NOTHROW_WRONLY1_3_LEAF, ATTR_NOTHROW_WRONLY2_3_LEAF): New.
>   (ATTR_RET1_NOTHROW_WRONLY1_LEAF, ATTR_RET1_NOTHROW_WRONLY1_3_LEAF): New.
>   (ATTR_RET1_NOTHROW_NONNULL_RDONLY2_LEAF): New.
>   (ATTR_RET1_NOTHROW_NONNULL_RDWR1_RDONLY2_LEAF): New.
>   (ATTR_RET1_NOTHROW_WRONLY1_3_RDONLY2_3_LEAF): New.
>   (ATTR_RET1_NOTHROW_WRONLY1_RDONLY2_LEAF): New.
>   (ATTR_RET1_NOTHROW_WRONLY1_3_RDONLY2_LEAF): New.
>   (ATTR_MALLOC_NOTHROW_NONNULL_RDONLY1_LEAF): New.
>   (ATTR_PURE_NOTHROW_NONNULL_RDONLY1_LEAF): New.
>   (ATTR_PURE_NOTHROW_NONNULL_RDONLY1_RDONLY2_LEAF): New.
>   (ATTR_RETNONNULL_RDONLY2_3_NOTHROW_LEAF): New.
>   (ATTR_RETNONNULL_WRONLY1_3_RDONLY2_3_NOTHROW_LEAF): New.
>   (ATTR_PURE_NOTHROW_NONNULL_RDONLY1_3_LEAF): New.
>   (ATTR_PURE_NOTHROW_NONNULL_RDONLY1_2_3_LEAF): New.
>   (ATTR_RETNONNULL_RDONLY2_NOTHROW_LEAF): New.
>   (ATTR_RETNONNULL_WRONLY1_RDONLY2_NOTHROW_LEAF): New.
>   (ATTR_RETNONNULL_WRONLY1_3_RDONLY2_NOTHROW_LEAF): New.
>   * builtins.c (check_access): Make extern.  Consistently set
>   the no-warning bit after issuing a warning.
>   * builtins.h (check_access): Declare.
>   * builtins.def (bcopy, bzero, index, memchr, memcmp, memcpy): Add
>   read_only and write_only attributes.
>   (memset, rindex, stpcpy, stpncpy, strcasecmp, strcat): Same.
>   (strchr, strcmp, strcpy, strcspn, strdup, strndup, strlen): Same.
>   (strncasecmp, strncat, strncmp, strncpy, strrchr, strspn, strstr): Same.
>   (free, __memcpy_chk, __memmove_chk, __memset_chk): Same.
>   (__strcpy_chk, __strncpy_chk): Same.
>   * calls.c (rdwr_access_hash): New type.
>   (rdwr_map): Same.
>   (init_attr_rdwr_indices): New function.
>   (maybe_warn_rdwr_sizes): Same.
>   (initialize_argument_information): Call init_attr_rdwr_indices.
>   Call maybe_warn_rdwr_sizes.
>   * doc/extend.texi (no_side_effect): Document new attribute.
>   (read_only, write_only, read_write): Same.
>   * tree-ssa-uninit.c (maybe_warn_uninit_access): New functions.
>   (warn_uninitialized_v

Re: Add a simulate_builtin_function_decl langhook

2019-10-27 Thread Jeff Law
On 10/5/19 5:29 AM, Richard Sandiford wrote:
> 
> Sure.  This message is going to go to the other extreme, sorry, but I'm
> not sure which part will be the most convincing (if any).
No worries.  Worst case going to the other extreme is I have to read it
more than once after nodding off halfway through :-)

Seriously though.  Thanks for the more expansive write-up.


> 
> I guess I should start by saying that SVE intrinsics have three types of
> predication (zeroing, merging, and "don't care") that combine with the usual
> type suffixes seen in arm_neon.h.  This gives 3,736 functions for baseline
> SVE (SVE2 adds more).
Ugh.  That'd be a lot of stuff to add into a header.  As you noted
later, there's significant costs to doing so.


> 
> The vast majority of those functions can't be open-coded using the
> core parts of C and C++, even with GNU extensions.  Some could perhaps
> be coded using new library extensions, but that just shifts the question
> from "how do we implement this feature in arm_sve.h?" to "how we implement
> this feature in the library extension?".
True.  But one could ask if those extensions are something that we're
likely to need beyond what you're doing now.  But I don't necessarily
think that would override the desire not to have so much stuff in the
header.

I'm guessing that even though you can't describe what you need at the
C/C++ level, you can describe at least some of what you want in
gimple/rtl?  Otherwise I'm not sure what you get above and beyond asms.

> 
> So that leaves us using built-in functions for almost all of those 3,736
> functions.  With the traditional approach, the target would need to
> register the functions at start-up and then define the header file in
> terms of them.
Yup.  And there's a cost you missed -- those things tend to end up in
the debugging output as well.  That's caused problems with system tools
in the past.


> 
> There are two approaches to doing that.  One is to define the built-in
> functions under their header file name so that they become active once
> a prototype is seen.  But that's only appropriate for functions like
> printf that have linkage.  The arm_sve.h functions don't have linkage
> and there's a chance that non-SVE code could be using the same names for
> something else (perhaps even with the same prototype, in the case of
> things like
> 
>   uint64_t svcntb (void);
> 
> that don't mention SVE types).
> 
> The other alternative is to define builtins in the "__builtin_"
> namespace and wrap them in inline wrappers, which I think is what
> most intrinsics header files do.  E.g., from arm_neon.h:
> 
> __extension__ extern __inline float64x2_t
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> vabsq_f64 (float64x2_t __a)
> {
>   return __builtin_aarch64_absv2df (__a);
> }
> 
> But that's 6 lines per function.  Counting a blank line between
> each one, we'd end up with a header file of at least 26,151 lines.
> (OK, so arm_neon.h is already longer than that.  But with SVE2 and with
> other considerations mentioned below, arm_sve.h could easily run into
> six figures this way.)
Yea, that's kind of what I would expect for exposing this stuff.  But I
didn't realize how many of these you were going to need.

[ ... ]


> 
> It's very hard to maintain header files like that by hand without
> introducing errors, such as forgetting to put the arguments safely
> in the implementation namespace ("__a" rather than "a" etc.).  Kyrill
> fixed some arm_neon.h instances of this the other week.  And using macros
> to handle common patterns just makes the error messages worse, since the
> macros then show up in error backtraces.
Yup.  We've seen this across multiple targets.  Y'all aren't alone here.

> 
> An alternative to maintaining the header file by hand is to generate
> it via a script.  Ideally the script would use the same metadata as
> the compiler itself uses when registering the built-in functions.
> But this means writing two pieces of code to process the metadata,
> one to generate text for the inline wrappers and one to register the
> built-ins.  And we still end up with the same very long header file.
Yup.  It's not ideal.

> 
> A more fundamental problem with inline wrappers is that they make it
> harder to honour the spec for constant arguments.  If an instruction
> requires a constant operand, Arm has traditionally required the
> associated intrinsic argument to be an integer constant expression
> (in the C and C++ sense).  GCC has tended to fudge this and instead only
> requires an integer constant at expand time, after inlining and constant
> propagation have taken place.  But this means that all sorts of other
> optimisations have happened too, and so what's constant at expand time
> wasn't necessarily constant at the language level.
> 
> Admittedly some people like that behaviour :-), but it means that what's
> acceptable depends on the vagaries of compiler optimisation.  It also means
> that code is often n

[PATCH v4 3/7] libstdc++ futex: Support waiting on std::chrono::steady_clock directly

2019-10-27 Thread Mike Crowe
The user-visible effect of this change is for std::future::wait_until to
use CLOCK_MONOTONIC when passed a timeout of std::chrono::steady_clock
type.  This makes it immune to any changes made to the system clock
CLOCK_REALTIME.

Add an overload of __atomic_futex_unsigned::_M_load_and_test_until_impl
that accepts a std::chrono::steady_clock, and correctly passes this through
to __atomic_futex_unsigned_base::_M_futex_wait_until_steady which uses
CLOCK_MONOTONIC for the timeout within the futex system call.  These
functions are mostly just copies of the std::chrono::system_clock versions
with small tweaks.

Prior to this commit, a std::chrono::steady timeout would be converted via
std::chrono::system_clock which risks reducing or increasing the timeout if
someone changes CLOCK_REALTIME whilst the wait is happening.  (The commit
immediately prior to this one increases the window of opportunity for that
from a short period during the calculation of a relative timeout, to the
entire duration of the wait.)

FUTEX_WAIT_BITSET was added in kernel v2.6.25.  If futex reports ENOSYS to
indicate that this operation is not supported then the code falls back to
using clock_gettime(2) to calculate a relative time to wait for.

I believe that I've added this functionality in a way that doesn't break
ABI compatibility, but that has made it more verbose and less type safe.  I
believe that it would be better to maintain the timeout as an instance of
the correct clock type all the way down to a single _M_futex_wait_until
function with an overload for each clock.  The current scheme of separating
out the seconds and nanoseconds early risks accidentally calling the wait
function for the wrong clock.  Unfortunately, doing this would break code
that compiled against the old header.

* libstdc++-v3/config/abi/pre/gnu.ver: Update for addition of
  __atomic_futex_unsigned_base::_M_futex_wait_until_steady.

* libstdc++-v3/include/bits/atomic_futex.h
  (__atomic_futex_unsigned_base): Add comments to clarify that
  _M_futex_wait_until and _M_load_and_test_until use CLOCK_REALTIME.
  Declare new _M_futex_wait_until_steady and
  _M_load_and_test_until_steady methods that use CLOCK_MONOTONIC.
  Add _M_load_and_test_until_impl and _M_load_when_equal_until
  overloads that accept a steady_clock time_point and use these new
  methods.

* libstdc++-v3/src/c++11/futex.cc: Include headers required for
clock_gettime. Add futex_clock_monotonic_flag constant to tell
futex to use CLOCK_MONOTONIC to match the existing
futex_clock_realtime_flag.  Add futex_clock_monotonic_unavailable
to store the result of trying to use
CLOCK_MONOTONIC. 
(__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
Add new variant of _M_futex_wait_until that uses CLOCK_MONOTONIC to
support waiting using steady_clock.
---
 libstdc++-v3/config/abi/pre/gnu.ver  | 10 +--
 libstdc++-v3/include/bits/atomic_futex.h | 67 +++-
 libstdc++-v3/src/c++11/futex.cc  | 82 +-
 3 files changed, 154 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
b/libstdc++-v3/config/abi/pre/gnu.ver
index e61fbe0..26492bc 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1916,10 +1916,9 @@ GLIBCXX_3.4.21 {
 _ZNSt7codecvtID[is]c*;
 _ZT[ISV]St7codecvtID[is]c*E;
 
-extern "C++"
-{
-  std::__atomic_futex_unsigned_base*;
-};
+# std::__atomic_futex_unsigned_base members
+_ZNSt28__atomic_futex_unsigned_base19_M_futex_notify_all*;
+_ZNSt28__atomic_futex_unsigned_base19_M_futex_wait_until*;
 
 # codecvt_utf8 etc.
 _ZNKSt19__codecvt_utf8_base*;
@@ -2291,6 +2290,9 @@ GLIBCXX_3.4.28 {
 
_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE[012]EEC2EOS5_;
 
_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE[012]EEC2EOS6_;
 
+# std::__atomic_futex_unsigned_base::_M_futex_wait_until_steady
+_ZNSt28__atomic_futex_unsigned_base26_M_futex_wait_until_steady*;
+
 } GLIBCXX_3.4.27;
 
 # Symbols in the support library (libsupc++) have their own tag.
diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
b/libstdc++-v3/include/bits/atomic_futex.h
index 36fda38..6d28d28 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -52,11 +52,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if defined(_GLIBCXX_HAVE_LINUX_FUTEX) && ATOMIC_INT_LOCK_FREE > 1
   struct __atomic_futex_unsigned_base
   {
-// Returns false iff a timeout occurred.
+// __s and __ns are measured against CLOCK_REALTIME. Returns false
+// iff a timeout occurred.
 bool
 _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
chrono::seconds

[PATCH v4 2/7] libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait

2019-10-27 Thread Mike Crowe
The futex system call supports waiting for an absolute time if
FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
benefits:

1. The call to gettimeofday is not required in order to calculate a
   relative timeout.

2. If someone changes the system clock during the wait then the futex
   timeout will correctly expire earlier or later.  Currently that only
   happens if the clock is changed prior to the call to gettimeofday.

According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To ensure
that the code still works correctly with earlier kernel versions, an ENOSYS
error from futex[1] results in the futex_clock_realtime_unavailable flag
being set.  This flag is used to avoid the unnecessary unsupported futex
call in the future and to fall back to the previous gettimeofday and
relative time implementation.

glibc applied an equivalent switch in pthread_cond_timedwait to use
FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
glibc-2.10 back in 2009.  See
glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7

The futex_clock_realtime_unavailable flag is accessed using
std::memory_order_relaxed to stop it becoming a bottleneck.  If the first
two calls to _M_futex_wait_until happen to run simultaneously then the
only consequence is that both will try to use FUTEX_CLOCK_REALTIME, both
risk discovering that it doesn't work and, if so, both set the flag.

[1] This is how glibc's nptl-init.c determines whether these flags are
supported.

* libstdc++-v3/src/c++11/futex.cc: Add new constants for required
futex flags.  Add futex_clock_realtime_unavailable flag to store
result of trying to use
FUTEX_CLOCK_REALTIME. 
(__atomic_futex_unsigned_base::_M_futex_wait_until):
Try to use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only
fall back to using gettimeofday and FUTEX_WAIT if that's not
supported.
---
 libstdc++-v3/src/c++11/futex.cc | 37 ++-
 1 file changed, 37 insertions(+)

diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index 3ea32f7..38ae448 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -35,8 +35,16 @@
 
 // Constants for the wait/wake futex syscall operations
 const unsigned futex_wait_op = 0;
+const unsigned futex_wait_bitset_op = 9;
+const unsigned futex_clock_realtime_flag = 256;
+const unsigned futex_bitset_match_any = ~0;
 const unsigned futex_wake_op = 1;
 
+namespace
+{
+  std::atomic<bool> futex_clock_realtime_unavailable;
+}
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -58,6 +66,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 else
   {
+   if (!futex_clock_realtime_unavailable.load(std::memory_order_relaxed))
+ {
+   struct timespec rt;
+   rt.tv_sec = __s.count();
+   rt.tv_nsec = __ns.count();
+   if (syscall (SYS_futex, __addr,
+futex_wait_bitset_op | futex_clock_realtime_flag,
+__val, &rt, nullptr, futex_bitset_match_any) == -1)
+ {
+   __glibcxx_assert(errno == EINTR || errno == EAGAIN
+   || errno == ETIMEDOUT || errno == ENOSYS);
+   if (errno == ETIMEDOUT)
+ return false;
+   if (errno == ENOSYS)
+ {
+   futex_clock_realtime_unavailable.store(true,
+   std::memory_order_relaxed);
+   // Fall through to legacy implementation if the system
+   // call is unavailable.
+ }
+   else
+ return true;
+ }
+   else
+ return true;
+ }
+
+   // We only get to here if futex_clock_realtime_unavailable was
+   // true or has just been set to true.
struct timeval tv;
gettimeofday (&tv, NULL);
// Convert the absolute timeout value to a relative timeout
-- 
git-series 0.9.1


[PATCH v4 4/7] libstdc++ atomic_futex: Use std::chrono::steady_clock as reference clock

2019-10-27 Thread Mike Crowe
The user-visible effect of this change is that std::future::wait_for now
uses std::chrono::steady_clock to determine the timeout.  This makes it
immune to changes made to the system clock.  It also means that anyone
using their own clock types with std::future::wait_until will have the
timeout converted to std::chrono::steady_clock rather than
std::chrono::system_clock.

Now that both std::chrono::steady_clock and
std::chrono::system_clock are correctly supported for the wait timeout, I
believe that std::chrono::steady_clock is a better choice for the reference
clock that all other clocks are converted to since it is guaranteed to
advance steadily.  The previous behaviour of converting to
std::chrono::system_clock risks timeouts changing dramatically when the
system clock is changed.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned): Change __clock_t typedef to use
steady_clock so that unknown clocks are synced to it rather than
system_clock. Change existing __clock_t overloads of
_M_load_and_test_until_impl and _M_load_when_equal_until to use
system_clock explicitly. Remove comment about DR 887 since these
changes address that problem as well as we are currently able.
---
 libstdc++-v3/include/bits/atomic_futex.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
b/libstdc++-v3/include/bits/atomic_futex.h
index 6d28d28..387ee9d 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -71,7 +71,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template <unsigned _Waiter_bit = 0x80000000>
   class __atomic_futex_unsigned : __atomic_futex_unsigned_base
   {
-typedef chrono::system_clock __clock_t;
+typedef chrono::steady_clock __clock_t;
 
 // This must be lock-free and at offset 0.
    atomic<unsigned> _M_data;
@@ -169,7 +169,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 unsigned
 _M_load_and_test_until_impl(unsigned __assumed, unsigned __operand,
bool __equal, memory_order __mo,
-   const chrono::time_point<__clock_t, _Dur>& __atime)
+   const chrono::time_point<chrono::system_clock, _Dur>& __atime)
 {
   auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
   auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
@@ -229,7 +229,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_until(unsigned __val, memory_order __mo,
  const chrono::time_point<_Clock, _Duration>& __atime)
   {
-   // DR 887 - Sync unknown clock to known clock.
const typename _Clock::time_point __c_entry = _Clock::now();
const __clock_t::time_point __s_entry = __clock_t::now();
const auto __delta = __atime - __c_entry;
@@ -241,7 +240,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  template<typename _Duration>
 _GLIBCXX_ALWAYS_INLINE bool
 _M_load_when_equal_until(unsigned __val, memory_order __mo,
-   const chrono::time_point<__clock_t, _Duration>& __atime)
+   const chrono::time_point<chrono::system_clock, _Duration>& __atime)
 {
   unsigned __i = _M_load(__mo);
   if ((__i & ~_Waiter_bit) == __val)
-- 
git-series 0.9.1


[PATCH v4 1/7] libstdc++: Improve async test

2019-10-27 Thread Mike Crowe
Add tests for waiting for the future using both std::chrono::steady_clock
and std::chrono::system_clock in preparation for dealing with those clocks
properly in futex.cc.

 * libstdc++-v3/testsuite/30_threads/async/async.cc (test02): Test
 steady_clock with std::future::wait_until.  (test03): Add new test
 templated on clock type waiting for future associated with async
 to resolve.  (main): Call test03 to test both system_clock and
 steady_clock.
---
 libstdc++-v3/testsuite/30_threads/async/async.cc | 33 +-
 1 file changed, 33 insertions(+)

diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc 
b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 851925b..1b6e2ff 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -51,17 +51,50 @@ void test02()
   VERIFY( status == std::future_status::timeout );
   status = f1.wait_until(std::chrono::system_clock::now());
   VERIFY( status == std::future_status::timeout );
+  status = f1.wait_until(std::chrono::steady_clock::now());
+  VERIFY( status == std::future_status::timeout );
   l.unlock();  // allow async thread to proceed
   f1.wait();   // wait for it to finish
   status = f1.wait_for(std::chrono::milliseconds(0));
   VERIFY( status == std::future_status::ready );
   status = f1.wait_until(std::chrono::system_clock::now());
   VERIFY( status == std::future_status::ready );
+  status = f1.wait_until(std::chrono::steady_clock::now());
+  VERIFY( status == std::future_status::ready );
+}
+
+// This test is prone to failures if run on a loaded machine where the
+// kernel decides not to schedule us for several seconds. It also
+// assumes that no-one will warp CLOCK whilst the test is
+// running when CLOCK is std::chrono::system_clock.
+template<typename CLOCK>
+void test03()
+{
+  auto const start = CLOCK::now();
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(2));
+});
+  std::future_status status;
+
+  status = f1.wait_for(std::chrono::milliseconds(500));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(1));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(5));
+  VERIFY( status == std::future_status::ready );
+
+  auto const elapsed = CLOCK::now() - start;
+  VERIFY( elapsed >= std::chrono::seconds(2) );
+  VERIFY( elapsed < std::chrono::seconds(5) );
 }
 
 int main()
 {
   test01();
   test02();
+  test03<std::chrono::system_clock>();
+  test03<std::chrono::steady_clock>();
   return 0;
 }
-- 
git-series 0.9.1


[PATCH v4 0/7] std::future::wait_* improvements

2019-10-27 Thread Mike Crowe
v1 of this series was originally posted back in September 2017 (see
https://gcc.gnu.org/ml/libstdc++/2017-09/msg00083.html )

v2 of this series was originally posted back in January 2018 (see
https://gcc.gnu.org/ml/libstdc++/2018-01/msg00035.html )

v3 of this series was originally posted back in August 2018 (see
https://gcc.gnu.org/ml/libstdc++/2018-08/msg1.html )

Changes since v3:

* Update libstdc++-v3/config/abi/pre/gnu.ver as recommended by
  Jonathan Wakely in
  https://gcc.gnu.org/ml/libstdc++/2018-08/msg00015.html .

* Replace _GLIBCXX_DEBUG_ASSERT with __glibcxx_assert in futex.cc as
  recommended by Jonathan Wakely in
  https://gcc.gnu.org/ml/libstdc++/2018-08/msg00014.html .

* Incorporate extra patch originally posted separately in
  https://gcc.gnu.org/ml/libstdc++/2018-08/msg00017.html .

* Rename incorrect futex_clock_realtime_unavailable to
  futex_clock_monotonic_unavailable in comment.  Spotted by Jonathan
  Wakely.

* Improve commit messages to use GNU two-spaces-after-periods style
  and add changelogs.

* Restrict lines to eighty columns or less.

* Remove unnecessarily-added steady_clock_copy from test - slow_clock
  works just as well.

* A few other minor tweaks to tests and comments.

Combined ChangeLog entry (generated from the separate messages in each
commit):

2019-10-27  Mike Crowe  

* libstdc++-v3/testsuite/30_threads/async/async.cc (test02):
Test steady_clock with std::future::wait_until.  (test03): Add
new test templated on clock type waiting for future associated
with async to resolve.  (main): Call test03 to test both
system_clock and steady_clock.

* libstdc++-v3/src/c++11/futex.cc: Add new constants for
required futex flags.  Add futex_clock_realtime_unavailable
flag to store result of trying to use FUTEX_CLOCK_REALTIME.
(__atomic_futex_unsigned_base::_M_futex_wait_until): Try to
use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only fall
back to using gettimeofday and FUTEX_WAIT if that's not
supported.

* libstdc++-v3/config/abi/pre/gnu.ver: Update for addition of
__atomic_futex_unsigned_base::_M_futex_wait_until_steady.
* libstdc++-v3/include/bits/atomic_futex.h
(__atomic_futex_unsigned_base): Add comments to clarify that
_M_futex_wait_until and _M_load_and_test_until use CLOCK_REALTIME.
Declare new _M_futex_wait_until_steady and
_M_load_and_test_until_steady methods that use
CLOCK_MONOTONIC.  Add _M_load_and_test_until_impl and
_M_load_when_equal_until overloads that accept a steady_clock
time_point and use these new methods.
* libstdc++-v3/src/c++11/futex.cc: Include headers required
for clock_gettime. Add futex_clock_monotonic_flag constant to
tell futex to use CLOCK_MONOTONIC to match the existing
futex_clock_realtime_flag.  Add
futex_clock_monotonic_unavailable to store the result of
trying to use CLOCK_MONOTONIC.
(__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
Add new variant of _M_futex_wait_until that uses
CLOCK_MONOTONIC to support waiting using steady_clock.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned): Change __clock_t typedef to use
steady_clock so that unknown clocks are synced to it rather
than system_clock. Change existing __clock_t overloads of
_M_load_and_test_until_impl and _M_load_when_equal_until to
use system_clock explicitly. Remove comment about DR 887 since
these changes address that problem as well as we are currently
able.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned) Add loop to _M_load_when_equal_until
on generic _Clock to check the timeout against _Clock again
after _M_load_when_equal_until returns indicating a timeout.
* libstdc++-v3/testsuite/30_threads/async/async.cc: Invent
slow_clock that runs at an eleventh of steady_clock's
speed. Use it to test the user-supplied-clock variant of
__atomic_futex_unsigned::_M_load_when_equal_until works
generally with test03 and loops correctly when the timeout
time hasn't been reached in test04.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned::_M_load_when_equal_for): Round up
timeout if required after conversion to reference clock
duration.
* libstdc++-v3/testsuite/30_threads/async/async.cc:
(test_pr68519): New test for the equivalent of PR
libstdc++/68519.

Mike Crowe (7):
  libstdc++: Improve async test
  libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait
  libstdc++ futex: Support waiting on std::chrono::steady_clock directly
  libstdc++ atomic_futex: Use std::chrono::steady_clock as reference clock
  libstdc++ futex: Loop when waiting against arbitrary clock
  libstdc++ a

[PATCH v4 7/7] libstdc++: Extra async tests, not for merging

2019-10-27 Thread Mike Crowe
These tests show that changing the system clock has an effect on
std::future::wait_until when using std::chrono::system_clock but not when
using std::chrono::steady_clock.  Unfortunately these tests have a number
of downsides:

1. Nothing that is attempting to keep the clock set correctly (ntpd,
   systemd-timesyncd) can be running at the same time.

2. The test process requires the CAP_SYS_TIME capability (although, as it's
   written it checks for being root.)

3. Other processes running concurrently may misbehave when the clock darts
   back and forth.

4. They are slow to run.

As such, I don't think they are suitable for merging. I include them here
because I wanted to document how I had tested the changes in the previous
commits.
---
 libstdc++-v3/testsuite/30_threads/async/async.cc | 70 +-
 1 file changed, 70 insertions(+)

diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc b/libstdc++-v3/testsuite/30_threads/async/async.cc
index ddd007d..979f5b4 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -24,6 +24,7 @@
 
 #include <future>
 #include <testsuite_hooks.h>
+#include <sys/time.h>
 
 using namespace std;
 
@@ -154,6 +155,71 @@ void test_pr68519()
   VERIFY( elapsed_steady >= std::chrono::seconds(1) );
 }
 
+void perturb_system_clock(const std::chrono::seconds &seconds)
+{
+  struct timeval tv;
+  if (gettimeofday(&tv, NULL))
+    abort();
+
+  tv.tv_sec += seconds.count();
+  if (settimeofday(&tv, NULL))
+    abort();
+}
+
+// Ensure that advancing CLOCK_REALTIME doesn't make any difference
+// when we're waiting on std::chrono::steady_clock.
+void test05()
+{
+  auto const start = chrono::steady_clock::now();
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(10));
+});
+
+  perturb_system_clock(chrono::seconds(20));
+
+  std::future_status status;
+  status = f1.wait_for(std::chrono::seconds(4));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(6));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(12));
+  VERIFY( status == std::future_status::ready );
+
+  auto const elapsed = chrono::steady_clock::now() - start;
+  VERIFY( elapsed >= std::chrono::seconds(10) );
+  VERIFY( elapsed < std::chrono::seconds(15) );
+
+  perturb_system_clock(chrono::seconds(-20));
+}
+
+// Ensure that advancing CLOCK_REALTIME does make a difference when
+// we're waiting on std::chrono::system_clock.
+void test06()
+{
+  auto const start = chrono::system_clock::now();
+  auto const start_steady = chrono::steady_clock::now();
+
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(5));
+  perturb_system_clock(chrono::seconds(60));
+  std::this_thread::sleep_for(std::chrono::seconds(5));
+});
+  future_status status;
+  status = f1.wait_until(start + std::chrono::seconds(60));
+  VERIFY( status == std::future_status::timeout );
+
+  auto const elapsed_steady = chrono::steady_clock::now() - start_steady;
+  VERIFY( elapsed_steady >= std::chrono::seconds(5) );
+  VERIFY( elapsed_steady < std::chrono::seconds(10) );
+
+  status = f1.wait_until(start + std::chrono::seconds(75));
+  VERIFY( status == std::future_status::ready );
+
+  perturb_system_clock(chrono::seconds(-60));
+}
+
 int main()
 {
   test01();
@@ -163,5 +229,9 @@ int main()
   test03();
   test04();
   test_pr68519();
+  if (geteuid() == 0) {
+    test05();
+    test06();
+  }
   return 0;
 }
-- 
git-series 0.9.1


[PATCH v4 6/7] libstdc++ atomic_futex: Avoid rounding errors in std::future::wait_for

2019-10-27 Thread Mike Crowe
Convert the specified duration to the target clock's duration type before
adding it to the current time.  This avoids the situation described in PR
libstdc++/68519 when the specified duration type lacks sufficient
precision.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned::_M_load_when_equal_for): Round up timeout
if required after conversion to reference clock duration.

* libstdc++-v3/testsuite/30_threads/async/async.cc: (test_pr68519):
  New test for the equivalent of PR libstdc++/68519.
---
 libstdc++-v3/include/bits/atomic_futex.h |  6 +-
 libstdc++-v3/testsuite/30_threads/async/async.cc | 15 +++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/atomic_futex.h b/libstdc++-v3/include/bits/atomic_futex.h
index b04e806..ec823eb 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -219,8 +219,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_for(unsigned __val, memory_order __mo,
  const chrono::duration<_Rep, _Period>& __rtime)
   {
+   using __dur = typename __clock_t::duration;
+   auto __reltime = chrono::duration_cast<__dur>(__rtime);
+   if (__reltime < __rtime)
+ ++__reltime;
return _M_load_when_equal_until(__val, __mo,
-   __clock_t::now() + __rtime);
+   __clock_t::now() + __reltime);
   }
 
 // Returns false iff a timeout occurred.
diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 7700913..ddd007d 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -140,6 +140,20 @@ void test04()
   }
 }
 
+void test_pr68519()
+{
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(1));
+});
+
+  std::chrono::duration<float> const wait_time = std::chrono::seconds(1);
+  auto const start_steady = chrono::steady_clock::now();
+  auto status = f1.wait_for(wait_time);
+  auto const elapsed_steady = chrono::steady_clock::now() - start_steady;
+
+  VERIFY( elapsed_steady >= std::chrono::seconds(1) );
+}
+
 int main()
 {
   test01();
@@ -148,5 +162,6 @@ int main()
   test03();
   test03();
   test04();
+  test_pr68519();
   return 0;
 }
-- 
git-series 0.9.1


[PATCH v4 5/7] libstdc++ futex: Loop when waiting against arbitrary clock

2019-10-27 Thread Mike Crowe
If std::future::wait_until is passed a time point measured against a clock
that is neither std::chrono::steady_clock nor std::chrono::system_clock
then the generic implementation of
__atomic_futex_unsigned::_M_load_when_equal_until is called which
calculates the timeout based on __clock_t and calls the
_M_load_when_equal_until method for that clock to perform the actual wait.

There's no guarantee that __clock_t is running at the same speed as the
caller's clock, so if the underlying wait times out we need to
check the timeout against the caller's clock again before potentially
looping.

Also add two extra tests to the testsuite's async.cc:

* run test03 with steady_clock_copy, which behaves identically to
  std::chrono::steady_clock, but isn't std::chrono::steady_clock. This
  causes the overload of __atomic_futex_unsigned::_M_load_when_equal_until
  that takes an arbitrary clock to be called.

* invent test04 which uses a deliberately slow running clock in order to
  exercise the looping behaviour of
  __atomic_futex_unsigned::_M_load_when_equal_until described above.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned) Add loop to _M_load_when_equal_until on
generic _Clock to check the timeout against _Clock again after
_M_load_when_equal_until returns indicating a timeout.

* libstdc++-v3/testsuite/30_threads/async/async.cc: Invent
slow_clock that runs at an eleventh of steady_clock's speed. Use it
  to test that the user-supplied-clock variant of
__atomic_futex_unsigned::_M_load_when_equal_until works generally
with test03 and loops correctly when the timeout time hasn't been
reached in test04.
---
 libstdc++-v3/include/bits/atomic_futex.h | 15 +++--
 libstdc++-v3/testsuite/30_threads/async/async.cc | 52 +-
 2 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_futex.h b/libstdc++-v3/include/bits/atomic_futex.h
index 387ee9d..b04e806 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -229,11 +229,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_until(unsigned __val, memory_order __mo,
  const chrono::time_point<_Clock, _Duration>& __atime)
   {
-   const typename _Clock::time_point __c_entry = _Clock::now();
-   const __clock_t::time_point __s_entry = __clock_t::now();
-   const auto __delta = __atime - __c_entry;
-   const auto __s_atime = __s_entry + __delta;
-   return _M_load_when_equal_until(__val, __mo, __s_atime);
+   typename _Clock::time_point __c_entry = _Clock::now();
+   do {
+ const __clock_t::time_point __s_entry = __clock_t::now();
+ const auto __delta = __atime - __c_entry;
+ const auto __s_atime = __s_entry + __delta;
+ if (_M_load_when_equal_until(__val, __mo, __s_atime))
+   return true;
+ __c_entry = _Clock::now();
+   } while (__c_entry < __atime);
+   return false;
   }
 
 // Returns false iff a timeout occurred.
diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 1b6e2ff..7700913 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -90,11 +90,63 @@ void test03()
   VERIFY( elapsed < std::chrono::seconds(5) );
 }
 
+// This clock is supposed to run at a tenth of normal speed, but we
+// don't have to worry about rounding errors causing us to wake up
+// slightly too early below if we actually run it at an eleventh of
+// normal speed. It is used to exercise the
+// __atomic_futex_unsigned::_M_load_when_equal_until overload that
+// takes an arbitrary clock.
+struct slow_clock
+{
+  using rep = std::chrono::steady_clock::rep;
+  using period = std::chrono::steady_clock::period;
+  using duration = std::chrono::steady_clock::duration;
+  using time_point = std::chrono::time_point<slow_clock, duration>;
+  static constexpr bool is_steady = true;
+
+  static time_point now()
+  {
+    auto steady = std::chrono::steady_clock::now();
+    return time_point{steady.time_since_epoch() / 11};
+  }
+};
+
+void test04()
+{
+  using namespace std::chrono;
+
+  auto const slow_start = slow_clock::now();
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(2));
+});
+
+  // Wait for ~1s
+  {
+    auto const steady_begin = steady_clock::now();
+    auto const status = f1.wait_until(slow_start + milliseconds(100));
+    VERIFY(status == std::future_status::timeout);
+    auto const elapsed = steady_clock::now() - steady_begin;
+    VERIFY(elapsed >= seconds(1));
+    VERIFY(elapsed < seconds(2));
+  }
+
+  // Wait for up to ~2s more
+  {
+    auto const steady_begin = steady_clock::now();
+    auto const status = f1.wait_until(slow_start + milliseconds(300));
+    VERIFY(status == std::future_status::ready);

Re: [PATCH][MSP430] Use hardware multiply routine to perform HImode widening multiplication (mulhisi3)

2019-10-27 Thread Jeff Law
On 10/23/19 11:37 AM, Jozef Lawrynowicz wrote:
> For MSP430 in some configurations, GCC will generate code for mulhisi3 by
> inserting instructions to widen each 16-bit operand before calling a library
> routine for mulsi3.
> However, there exists a hardware multiply routine to perform this widening
> multiplication, but it is only made use of at -O3 where it is inserted
> inline into program.
> 
> We can reduce code size and improve performance by always calling the mspabi
> helper function to perform this widening multiplication when hardware multiply
> is available. 
> 
> I also benchmarked the effect of using a library function for mulsidi3
> but it resulted in slower execution both with and without hardware multiply
> support. It also increased code size for a variety of programs.
> 
> Successfully regtested on trunk.
> 
> Additionally regtested msp430.exp at -O1, -O2, -O3 and -Os.
> There are tests which check each supported hardware multiply option
> executes correctly, so running at these optimization levels verifies the 
> changes
> in this patch.
> 
> Ok for trunk?
> 
> 
> 0001-MSP430-Use-mspabi-helper-function-to-perform-HImode-.patch
> 
> From 695ae0e560396034bc1fc2e9d9e601ab7b3d901b Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Wed, 23 Oct 2019 13:19:45 +0100
> Subject: [PATCH] MSP430: Use mspabi helper function to perform HImode widening
>  multiplication
> 
> gcc/ChangeLog:
> 
> 2019-10-23  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.c (msp430_expand_helper): Support expansion of
>   calls to __mspabi_mpy* functions.
>   * config/msp430/msp430.md (mulhisi3): New define_expand.
>   (umulhisi3): New define_expand.
>   (*mulhisi3_inline): Use old mulhisi3 define_insn.
>   (*umulhisi3_inline): Use old umulhisi3 define_insn.
OK
jeff



Re: [PATCH] Move jump threading before reload

2019-10-27 Thread Jeff Law
On 10/18/19 3:06 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on x86_64-redhat-linux, s390x-redhat-linux
> and ppc64le-redhat-linux.  The offending patch is in gcc-9_1_0-release
> and gcc-9_2_0-release - do I need to backport this fix to gcc-9-branch?
> 
> 
> r266734 has introduced a new instance of jump threading pass in order to
> take advantage of opportunities that combine opens up.  It was perceived
> back then that it was beneficial to delay it after reload, since that
> might produce even more such opportunities.
> 
> Unfortunately jump threading interferes with hot/cold partitioning.  In
> the code from PR92007, it converts the following
> 
>   +-- 2/HOT +
>   | |
>   v v
> 3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
> |   ^
> |   |
> +---+
> 
> into the following:
> 
>   +-- 2/HOT --+
>   |   |
>   v   v
> 3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT
> 
> This makes hot bb 6 dominated by cold bb 11, and because of this
> fixup_partitions makes bb 6 cold as well, which in turn makes EH edge
> 6->16 a crossing one.  Not only can't we have crossing EH edges, we are
> also not allowed to introduce new crossing edges after reload in
> general, since it might require extra registers on some targets.
> 
> Therefore, move the jump threading pass between combine and hot/cold
> partitioning.  Building SPEC 2006 and SPEC 2017 with the old and the new
> code indicates that:
> 
> * When doing jump threading right after reload, 3889 edges are threaded.
> * When doing jump threading right after combine, 3918 edges are
>   threaded.
> 
> This means this change will not introduce performance regressions.
> 
> gcc/ChangeLog:
> 
> 2019-10-17  Ilya Leoshkevich  
> 
>   PR rtl-optimization/92007
>   * cfgcleanup.c (thread_jump): Add an assertion that we don't
>   call it after reload if hot/cold partitioning has been done.
>   (class pass_postreload_jump): Rename to
>   pass_jump_after_combine.
>   (make_pass_postreload_jump): Rename to
>   make_pass_jump_after_combine.
>   * passes.def(pass_postreload_jump): Move before reload, rename
>   to pass_jump_after_combine.
>   * tree-pass.h (make_pass_postreload_jump): Rename to
>   make_pass_jump_after_combine.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-10-17  Ilya Leoshkevich  
> 
>   PR rtl-optimization/92007
>   * g++.dg/opt/pr92007.C: New test (from Arseny Solokha).
OK for trunk and gcc-9 branch.

jeff



Re: [PATCH 9/9] ifcvt: Also pass reversed cc comparison.

2019-10-27 Thread Richard Sandiford
Robin Dapp  writes:
> When then and else are reversed, we would swap new_val and old_val.
> The same has to be done for our new code paths.
> Also, emit_conditional_move may perform swapping.  In case we need to
> swap, the cc comparison also needs to be swapped and for this we pass
> the reversed cc comparison directly.  An alternative would be to pass
> the constituents of the compare directly, but the functions have more
> than enough parameters already.

The changes to emit_conditional_move shouldn't be needed if we pass
the condition directly to a new overload, as per the 6/9 comments.
Let me know if that's not true.

> ---
>  gcc/ifcvt.c  | 37 ++---
>  gcc/optabs.c |  4 +++-
>  gcc/optabs.h |  2 +-
>  3 files changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 09bf443656c..d3959b870f6 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -3313,7 +3315,7 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>*/
>  
>rtx_insn *seq, *seq1 = 0, *seq2 = 0;
> -  rtx temp_dest, temp_dest1, temp_dest2;
> +  rtx temp_dest, temp_dest1 = NULL_RTX, temp_dest2 = NULL_RTX;
>unsigned int cost1 = 0, cost2 = 0;
>bool speed_p = if_info->speed_p;
>push_topmost_sequence ();
> @@ -3324,7 +3326,13 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
> temp_dest1 = noce_emit_cmove (if_info, temp, cond_code,
> x, y, new_val, old_val, NULL_RTX);
>   else
> -   noce_emit_move_insn (target, new_val);
> +   {
> + temp_dest1 = target;
> + if (if_info->then_else_reversed)
> +   noce_emit_move_insn (target, old_val);
> + else
> +   noce_emit_move_insn (target, new_val);
> +   }
>  
>   /* If we failed to expand the conditional move, drop out and don't
>  try to continue.  */
> @@ -3352,9 +3360,15 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>   start_sequence ();
>   if (!need_no_cmovs.get (insn))
> temp_dest2 = noce_emit_cmove (if_info, target, cond_code,
> -   x, y, new_val, old_val, cc_cmp);
> +   x, y, new_val, old_val, cc_cmp, rev_cc_cmp);
>   else
> -   noce_emit_move_insn (target, new_val);
> +   {
> + temp_dest2 = target;
> + if (if_info->then_else_reversed)
> +   noce_emit_move_insn (target, old_val);
> + else
> +   noce_emit_move_insn (target, new_val);
> +   }
>  
>   /* If we failed to expand the conditional move, drop out and don't
>  try to continue.  */
> @@ -3386,7 +3400,8 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
> /* We do not require a temp register, so remove it from the list.  */
> seq = seq2;
> temp_dest = temp_dest2;
> -   temps_created.remove (temp);
> +   if (temps_created.get (temp))
> + temps_created.remove (temp);
>   }
>else
>   {
> @@ -3436,7 +3451,8 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>/* Mark all our temporaries and targets as used.  */
>for (int i = 0; i < count; i++)
>  {
> -  set_used_flags (temporaries[i]);
> +  if (temps_created.get(targets[i]))
> + set_used_flags (temporaries[i]);
>set_used_flags (targets[i]);
>  }
>  
> @@ -3941,7 +3957,7 @@ check_need_temps (basic_block bb, hash_map 
> *need_temp, rtx cond)
>  
>FOR_BB_INSNS (bb, insn)
>  {
> -  rtx set, src, dest;
> +  rtx set, dest;
>  
>if (!active_insn_p (insn))
>   continue;
> @@ -3950,7 +3966,6 @@ check_need_temps (basic_block bb, hash_map 
> *need_temp, rtx cond)
>if (set == NULL_RTX)
>   continue;
>  
> -  src = SET_SRC (set);
>dest = SET_DEST (set);
>  
>/* noce_emit_cmove will emit the condition check every time it is 
> called

It looks like these changes are really fixes for previous patches in the
series.  It'd be clearer to fold them into the relevant previous patch
if possible.

Thanks,
Richard


Re: [PATCH 7/9] ifcvt: Emit two cmov variants and choose the less expensive one.

2019-10-27 Thread Richard Sandiford
Robin Dapp  writes:
> This patch duplicates the previous noce_emit_cmove logic.  First it
> passes the canonical comparison, emits the sequence, and costs it.
> Then, a second, separate sequence is created by passing the cc compare
> we extracted before.  The costs of both sequences are compared and the
> cheaper one is emitted.
> ---
>  gcc/ifcvt.c | 106 +++-
>  1 file changed, 97 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 3db707e1fd1..955f9541f60 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -3208,8 +3208,11 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>  
>hash_map need_temps;
>  
> -  if (!cc_cmp)
> -check_need_temps (then_bb, &need_temps, cond);
> +  /* If we newly set a CC before a cmov, we might need a temporary
> + even though the compare will be removed by a later pass.  Costing
> + of a sequence with these will be inaccurate.  */
> +
> +  check_need_temps (then_bb, &need_temps, cond);
>  
>hash_map temps_created;
>  
> @@ -3298,18 +3301,103 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
> old_val = lowpart_subreg (dst_mode, old_val, src_mode);
>   }
>  
> -  /* Actually emit the conditional move.  */
> -  rtx temp_dest = noce_emit_cmove (if_info, temp, cond_code,
> -x, y, new_val, old_val, cc_cmp);
> +  /* Try emitting a conditional move passing the backend the
> +  canonicalized comparison.  This can e.g. be used to recognize
> +  expressions like
> +
> +if (a < b)
> +{
> +  ...
> +  b = a;
> +}
> +
> +  A backend can recognize this and emit a min/max insn.
> +  We will still emit a superfluous CC comparison before the
> +  min/max, though, which complicates costing.
> +  */
> +
> +  rtx_insn *seq, *seq1 = 0, *seq2 = 0;
> +  rtx temp_dest, temp_dest1, temp_dest2;
> +  unsigned int cost1 = 0, cost2 = 0;
> +  bool speed_p = if_info->speed_p;
> +  push_topmost_sequence ();

Why do we need the push_topmost_sequence/pop_topmost_sequence?
start_sequence/end_sequence ought to be enough.

> +
> +  {
> + start_sequence ();
> + temp_dest1 = noce_emit_cmove (if_info, temp, cond_code,
> + x, y, new_val, old_val, NULL_RTX);

Nit: arguments should be indented below "if_info, ".

> +
> + /* If we failed to expand the conditional move, drop out and don't
> +try to continue.  */
> + if (temp_dest1 == NULL_RTX)
> +   {
> + seq1 = NULL;
> + cost1 = COSTS_N_INSNS (100);
> +   }
> + else
> +   {
> + seq1 = get_insns ();
> + cost1 = seq_cost (seq1, speed_p);
> + rtx_insn *ii;
> + for (ii = seq1; ii; ii = NEXT_INSN (ii))
> +   if (recog_memoized (ii) == -1)
> + cost1 += 1;
> +   }
> + end_sequence ();

Don't think we should use fake costs for failure.  Just setting seq1 to
null should be enough, and should be OK for the recog_memoized case too.
Can break the loop as soon as we find a failure.

> +  }
> +
> + /* Now try emitting one passing a non-canonicalized cc comparison.
> +This allows the backend to emit a cmov directly without additional
> +compare.  */
> +  {
> + start_sequence ();
> + temp_dest2 = noce_emit_cmove (if_info, target, cond_code,
> + x, y, new_val, old_val, cc_cmp);
> +
> + /* If we failed to expand the conditional move, drop out and don't
> +try to continue.  */
> + if (temp_dest2 == NULL_RTX)
> +   {
> + seq2 = NULL;
> + cost2 = COSTS_N_INSNS (100);
> +   }
> + else
> +   {
> + seq2 = get_insns ();
> + cost2 = seq_cost (seq2, speed_p);
> + rtx_insn *ii;
> + for (ii = seq2; ii; ii = NEXT_INSN (ii))
> +   if (recog_memoized (ii) == -1)
> + cost2 += 1;
> +   }
> + end_sequence ();
> +  }

Same comments here.  Would be nice to split this out into a subroutine
rather than duplicate it.

Thanks,
Richard

>  
> -  /* If we failed to expand the conditional move, drop out and don't
> -  try to continue.  */
> -  if (temp_dest == NULL_RTX)
> +  /* Check which version is less expensive.  */
> +  if (cost1 <= cost2 && seq1 != NULL_RTX)
> + {
> +   temp_dest = temp_dest1;
> +   seq = seq1;
> + }
> +  else if (seq2 != NULL_RTX)
> + {
> +   /* We do not require a temp register, so remove it from the list.  */
> +   seq = seq2;
> +   temp_dest = temp_dest2;
> +   temps_created.remove (temp);
> + }
> +  else
>   {
> +   /* Nothing worked, bail out.  */
> +   pop_topmost_sequence ();
> end_sequence ();
> -   return FALSE;
> +   return false;
>   }
>  
> +  /* End the sub sequence and emit to the main sequence.  */
> +  pop_topmost_sequence ();

Re: [PATCH 6/9] ifcvt: Extract cc comparison from jump.

2019-10-27 Thread Richard Sandiford
Robin Dapp  writes:
> This patch extracts a cc comparison from the initial compare/jump
> insn and allows it to be passed to noce_emit_cmove and
> emit_conditional_move.
> ---
>  gcc/ifcvt.c  | 68 
>  gcc/optabs.c |  7 --
>  gcc/optabs.h |  2 +-
>  3 files changed, 69 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 99716e5f63c..3db707e1fd1 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -86,6 +86,7 @@ static rtx_insn *find_active_insn_after (basic_block, 
> rtx_insn *);
>  static basic_block block_fallthru (basic_block);
>  static rtx cond_exec_get_condition (rtx_insn *);
>  static rtx noce_get_condition (rtx_insn *, rtx_insn **, bool);
> +static rtx noce_get_compare (rtx_insn *, bool);
>  static int noce_operand_ok (const_rtx);
>  static void merge_if_block (ce_if_block *);
>  static int find_cond_trap (basic_block, edge, edge);
> @@ -775,7 +776,7 @@ static int noce_try_addcc (struct noce_if_info *);
>  static int noce_try_store_flag_constants (struct noce_if_info *);
>  static int noce_try_store_flag_mask (struct noce_if_info *);
>  static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
> - rtx, rtx, rtx);
> + rtx, rtx, rtx, rtx = NULL);
>  static int noce_try_cmove (struct noce_if_info *);
>  static int noce_try_cmove_arith (struct noce_if_info *);
>  static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
> @@ -1658,7 +1659,7 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
>  
>  static rtx
>  noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
> -  rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue)
> +  rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue, rtx cc_cmp)
>  {
>rtx target ATTRIBUTE_UNUSED;
>int unsignedp ATTRIBUTE_UNUSED;
> @@ -1706,7 +1707,7 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, 
> enum rtx_code code,
>  
>target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
> vtrue, vfalse, GET_MODE (x),
> -   unsignedp);
> +   unsignedp, cc_cmp);
>if (target)
>  return target;
>  
> @@ -2970,6 +2971,60 @@ noce_get_condition (rtx_insn *jump, rtx_insn 
> **earliest, bool then_else_reversed
>return tmp;
>  }
>  
> +/* Get the comparison from the insn.  */
> +static rtx
> +noce_get_compare (rtx_insn *jump, bool then_else_reversed)
> +{
> +  enum rtx_code code;
> +  const_rtx set;
> +  rtx tem;
> +  rtx op0, op1;
> +
> +  if (!have_cbranchcc4)
> +return 0;

Why do we need to check this?

> +
> +  if (! any_condjump_p (jump))
> +return NULL_RTX;

There's a mixture of "return 0" and "return NULL_RTX"; think the latter
is preferred.

> +
> +  set = pc_set (jump);
> +
> +  /* If this branches to JUMP_LABEL when the condition is false,
> + reverse the condition.  */
> +  bool reverse = (GET_CODE (XEXP (SET_SRC (set), 2)) == LABEL_REF
> +  && label_ref_label (XEXP (SET_SRC (set), 2)) == JUMP_LABEL (jump));
> +
> +  /* We may have to reverse because the caller's if block is not canonical,
> + i.e. the THEN block isn't the fallthrough block for the TEST block
> + (see find_if_header).  */
> +  if (then_else_reversed)
> +reverse = !reverse;
> +
> +  rtx cond = XEXP (SET_SRC (set), 0);
> +
> +  code = GET_CODE (cond);
> +  op0 = XEXP (cond, 0);
> +  op1 = XEXP (cond, 1);
> +
> +  if (reverse)
> +code = reversed_comparison_code (cond, jump);
> +  if (code == UNKNOWN)
> +return 0;
> +
> +  /* If constant is first, put it last.  */
> +  if (CONSTANT_P (op0))
> +code = swap_condition (code), tem = op0, op0 = op1, op1 = tem;

Can use std::swap here.  But this should never happen: we're looking
at the operands to a condition in an existing instruction, which should
always be canonical.

> +
> +  /* Never return CC0; return zero instead.  */
> +  if (CC0_P (op0))
> +return 0;
> +
> +  /* We promised to return a comparison.  */
> +  rtx ret = gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
> +  if (COMPARISON_P (ret))
> +return ret;
> +  return 0;
> +}

Without the unnecessary swapping, it looks like the only difference
between this routine and cond_exec_get_condition is the initial
have_cbranchcc4 test and the final COMPARISON_P (ret) test.

The COMPARISON_P test shouldn't really be needed, since I think
all jump (if_then_else ...)s are supposed to have comparisons,
and targets can have no complaints if an if_then_else condition
from a jump gets carried over to a conditional move.  So I think
we can probably drop that too.

I didn't understand why have_cbranchcc4 was needed so can't
comment on that.  But assuming that that can be dropped, it looks
like we could rename cond_exec_get_condition to something more generic
and use it here too.

> +
>  /* Return true if OP is ok for if-then-else processing.  */
>  
static

Re: [PATCH 5/9] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2019-10-27 Thread Richard Sandiford
Coming back to this just in time for it not to be three months later,
sorry...

I still think it would be better to consolidate ifcvt a bit more,
rather than effectively duplicate bits of cond_move_process_if_block
in noce_convert_multiple_sets.  But perhaps it was a historical
mistake to have two separate routines in the first place, and I guess
making noce_convert_multiple_sets handle more cases might allow us
to get rid of cond_move_process_if_block at some point.  So let's
go ahead with this anyway.

Robin Dapp  writes:
> This patch allows immediate then/else operands for cmovs.
> We rely on emit_conditional_move returning NULL if something unsupported
> was generated.
>
> Also, minor refactoring is performed.
>
> --
>
> gcc/ChangeLog:
>
> 2018-11-14  Robin Dapp  
>
>   * ifcvt.c (have_const_cmov): New function.
>   (noce_convert_multiple_sets): Allow constants if supported.
>   (bb_ok_for_noce_convert_multiple_sets): Likewise.
>   (check_cond_move_block): Refactor.
> ---
>  gcc/ifcvt.c | 21 +++--
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 55205cac153..99716e5f63c 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -3214,7 +3214,9 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>we'll end up trying to emit r4:HI = cond ? (r1:SI) : (r3:HI).
>Wrap the two cmove operands into subregs if appropriate to prevent
>that.  */
> -  if (GET_MODE (new_val) != GET_MODE (temp))
> +
> +  if (!CONST_INT_P (new_val)
> + && GET_MODE (new_val) != GET_MODE (temp))
>   {
> machine_mode src_mode = GET_MODE (new_val);
> machine_mode dst_mode = GET_MODE (temp);
> @@ -3225,7 +3227,8 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>   }
> new_val = lowpart_subreg (dst_mode, new_val, src_mode);
>   }
> -  if (GET_MODE (old_val) != GET_MODE (temp))
> +  if (!CONST_INT_P (old_val)
> + && GET_MODE (old_val) != GET_MODE (temp))
>   {
> machine_mode src_mode = GET_MODE (old_val);
> machine_mode dst_mode = GET_MODE (temp);
> @@ -3362,9 +3365,9 @@ bb_ok_for_noce_convert_multiple_sets (basic_block 
> test_bb, unsigned *cost)
>if (!REG_P (dest))
>   return false;
>  
> -  if (!(REG_P (src)
> -|| (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
> -&& subreg_lowpart_p (src
> +  if (!((REG_P (src) || (CONST_INT_P (src)))
> + || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
> +   && subreg_lowpart_p (src
>   return false;
>  
>/* Destination must be appropriate for a conditional write.  */

CONSTANT_P (as for check_cond_move_block) would cover more cases than
CONST_INT_P.  No need for brackets around CONST_INT_P (...).

I was tempted to say we should use register_operand too, but I guess
that allows unwanted (subreg (mem...))s (as if there's any other kind).

> @@ -3724,7 +3727,7 @@ check_cond_move_block (basic_block bb,
>  {
>rtx set, dest, src;
>  
> -  if (!NONDEBUG_INSN_P (insn) || JUMP_P (insn))
> +  if (!active_insn_p (insn))
>   continue;
>set = single_set (insn);
>if (!set)

I assume this is to skip notes, but shouldn't this still include JUMP_P?

It also looks like it should be an independent patch.

> @@ -3740,10 +3743,8 @@ check_cond_move_block (basic_block bb,
>if (!CONSTANT_P (src) && !register_operand (src, VOIDmode))
>   return FALSE;
>  
> -  if (side_effects_p (src) || side_effects_p (dest))
> - return FALSE;
> -
> -  if (may_trap_p (src) || may_trap_p (dest))
> +  /* Check for side effects and trapping.  */
> +  if (!noce_operand_ok (src) || !noce_operand_ok (dest))
>   return FALSE;
>  
>/* Don't try to handle this if the source register was

Seems wrong to use something called noce_operand_ok in a routine
that is explicitly handling cases in which conditional execution
is possible.

More importantly, I can't see anywhere that handles the MEM_P case
specially for cond_move_*, so are you sure this is really safe?
Might be better to leave it as-is if this part is just a clean-up.

Thanks,
Richard


Re: [SVE] PR91272

2019-10-27 Thread Richard Sandiford
Prathamesh Kulkarni  writes:
> @@ -10288,6 +10261,23 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> vec_compare = vec_compare_name;
>   }
> +
> +   if (masks)
> + {
> +   unsigned vec_num = vec_oprnds0.length ();
> +   tree loop_mask
> + = vect_get_loop_mask (gsi, masks, vec_num * ncopies,
> +   vectype, vec_num * j + i);
> +   tree tmp2 = make_ssa_name (vec_cmp_type);
> +   gassign *g = gimple_build_assign (tmp2, BIT_AND_EXPR,
> + vec_compare, loop_mask);

Nit: misindented line.

OK with that change, thanks.

Richard


Re: Free ipa-prop edge summaries for inline calls

2019-10-27 Thread Jan Hubicka
> Hi,
> this patch makes ipa-prop free edge summaries (jump functions) for
> calls which have been inlined because they are no longer useful.
> The main change is to change IPA_EDGE_REF from get_create to get,
> and thus we need to watch for missing summaries at some places so
> that combining -O0 and -O2 code works.
> 
> Bootstrapped/regtested x86_64-linux, committed.
> 
>   * ipa-cp.c (propagate_constants_across_call): If args are not available
>   just drop everything to varying.
>   (find_aggregate_values_for_callers_subset): Watch for missing
>   edge summary.
>   (find_more_scalar_values_for_callers_subs): Likewise.
>   * ipa-prop.c (ipa_compute_jump_functions_for_edge,
>   update_jump_functions_after_inlining, propagate_controlled_uses):
>   Watch for missing summaries.
>   (ipa_propagate_indirect_call_infos): Remove summary after propagation
>   is finished.
>   (ipa_write_node_info): Watch for missing summaries.
>   (ipa_read_edge_info): Create new ref.
>   (ipa_edge_args_sum_t): Add remove.
>   (IPA_EDGE_REF_GET_CREATE): New macro.
>   * ipa-fnsummary.c (evaluate_properties_for_edge): Watch for missing
>   edge summary.
>   (remap_edge_change_prob): Likewise.
Sadly this patch breaks lto bootstrap with

0xd8193d ipa_edge_args_sum_t::duplicate(cgraph_edge*, cgraph_edge*, ipa_edge_args*, ipa_edge_args*)
../../gcc/ipa-prop.c:3883
0xd8a113 call_summary::symtab_duplication(cgraph_edge*, cgraph_edge*, void*)
../../gcc/symbol-summary.h:771
0x9ecbc9 symbol_table::call_edge_duplication_hooks(cgraph_edge*, cgraph_edge*)
../../gcc/cgraph.c:453
0xa0a57a cgraph_edge::clone(cgraph_node*, gcall*, unsigned int, profile_count, profile_count, bool)
../../gcc/cgraphclones.c:141
0xa0b5b9 cgraph_node::create_clone(tree_node*, profile_count, bool, vec, bool, cgraph_node*, ipa_param_adjustments*, char const*)
../../gcc/cgraphclones.c:391
0x1f78f3f clone_inlined_nodes(cgraph_edge*, bool, bool, int*)
../../gcc/ipa-inline-transform.c:221
0x1f78ff4 clone_inlined_nodes(cgraph_edge*, bool, bool, int*)
../../gcc/ipa-inline-transform.c:236
0x1f79c6c inline_call(cgraph_edge*, bool, vec*, int*, bool, bool*)
../../gcc/ipa-inline-transform.c:479
0x1f6ae28 inline_small_functions
../../gcc/ipa-inline.c:2151
0x1f6c8e6 ipa_inline
../../gcc/ipa-inline.c:2615
0x1f6d792 execute
../../gcc/ipa-inline.c:3023

So I suppose that the reference counting still makes use of the jump
functions at inlined edges (even though it does not make much sense).
This patch removes the code releasing them until this is sorted out.

Index: ChangeLog
===
--- ChangeLog   (revision 277485)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2019-10-27  Jan Hubicka  
+
+   * ipa-prop.c (ipa_propagate_indirect_call_infos): Do not remove
+   jump functions.
+
 2019-10-27  Eric Botcazou 
 
* cgraph.c (cgraph_node::rtl_info): Fix cut&pasto in comment.
Index: ipa-prop.c
===
--- ipa-prop.c  (revision 277484)
+++ ipa-prop.c  (working copy)
@@ -3706,7 +3706,6 @@ ipa_propagate_indirect_call_infos (struc
 
   propagate_controlled_uses (cs);
   changed = propagate_info_to_inlined_callees (cs, cs->callee, new_edges);
-  ipa_edge_args_sum->remove (cs);
 
   return changed;
 }


[patch,avr,committed] Remove an unused function (PR85969)

2019-10-27 Thread Georg-Johann Lay

Applied as obvious

Johann

PR target/85969
* config/avr/gen-avr-mmcu-specs.c (str_prefix_p): Remove unused
static function.
--- trunk/gcc/config/avr/gen-avr-mmcu-specs.c   2019/10/25 14:39:06 277454
+++ trunk/gcc/config/avr/gen-avr-mmcu-specs.c   2019/10/25 15:13:23 277455
@@ -50,14 +50,6 @@
 #define SPECFILE_USAGE_URL  \
  "https://gcc.gnu.org/gcc-5/changes.html"
 
-/* Return true iff STR starts with PREFIX.  */
-
-static bool
-str_prefix_p (const char *str, const char *prefix)
-{
-  return strncmp (str, prefix, strlen (prefix)) == 0;
-}
-
 
 static const char header[] =
   "#\n"


Re: [PATCH] C testsuite, silence a FreeBSD libc warning

2019-10-27 Thread Kamil Rytarowski
On 26.10.2019 23:57, Andreas Tobler wrote:
> On 04.10.19 19:04, Jeff Law wrote:
>> On 9/30/19 12:47 PM, Andreas Tobler wrote:
>>> On 30.09.19 20:37, Kamil Rytarowski wrote:
 On 30.09.2019 19:47, Jakub Jelinek wrote:
> On Mon, Sep 30, 2019 at 07:41:00PM +0200, Andreas Tobler wrote:
>> --- fprintf-2.c    (revision 276292)
>> +++ fprintf-2.c    (working copy)
>> @@ -1,7 +1,8 @@
>>    /* Verify that calls to fprintf don't get eliminated even if their
>>   result on success can be computed at compile time (they can
>> fail).
>>   The calls can still be transformed into those of other
>> functions.
>> -   { dg-skip-if "requires io" { freestanding } } */
>> +   { dg-skip-if "requires io" { freestanding } }
>> +   { dg-prune-output "(^|\n)(\[^\n\])*warning: warning: \[^\n\]*
>> possibly used unsafely; consider using \[^\n\]*\n" } */
>
> I'm worried about that (^|\n) at the start + \n at the end, doesn't
> it prune
> too much then?
> Looking at other tests, they dg-prune-output just a few words from a
> message, or .*few words.*
> So, can you try just
>  { dg-prune-output "warning: warning: \[^\n\r\]* possibly used
> unsafely; consider using" } */
> or if that doesn't work, with .* at start end end?
>
>  Jakub
>

 Please handle the NetBSD specific string too: "warning: tmpnam()
 possibly used unsafely, use mkstemp() or mkdtemp()".

 https://nxr.netbsd.org/xref/src/lib/libc/stdio/tmpnam.c#52

>>>
>>> Ok, I think the attached version should also match these cases. Although
>>> untested on NetBSD.
>>> Kamil, if you have cycles, would you mind giving it a run? Thanks!
>>> Andreas
>>>
>> OK assuming Kamil's testing shows that it works.
>
> Kamil, do you have a feedback? If not I'm going to commit by tomorrow.
>
> Andreas

Please go for it.


Re: introduce -fcallgraph-info option

2019-10-27 Thread Alexandre Oliva
On Oct 26, 2019, Alexandre Oliva  wrote:

> E.g., the reason we gather expanded calls rather than just use
> cgraph_edges is that the latter would dump several "calls" that are
> builtins expanded internally by the compiler, and would NOT dump other
> ops that are expanded as (lib)calls.

It occurred to me that we could reuse most cgraph edges and avoid
duplicating them in the callees vec, namely those that aren't builtins
or libcalls.  Those that are builtins might or might not become actual
calls, so we disregard the cgraph_edges and record them in the vec
instead.  Those that are libcalls (builtins in a different sense)
introduced during expand are not even present in the cgraph edges, so we
record them in the vec as well.  Here's the patch that implements this.

Regstrapped on x86_64-linux-gnu.  Ok to install?

for  gcc/ChangeLog
From  Eric Botcazou, Alexandre Oliva

* common.opt (-fcallgraph-info[=]): New option.
* doc/invoke.texi (Debugging options): Document it.
* opts.c (common_handle_option): Handle it.
* builtins.c (expand_builtin_alloca): Record allocation if
-fcallgraph-info=da.
* calls.c (expand_call): If -fcallgraph-info, record the call.
(emit_library_call_value_1): Likewise.
* flag-types.h (enum callgraph_info_type): New type.
* explow.c: Include stringpool.h.
(set_stack_check_libfunc): Set SET_SYMBOL_REF_DECL on the symbol.
* function.c (allocate_stack_usage_info): New.
(allocate_struct_function): Call it for -fcallgraph-info.
(prepare_function_start): Call it otherwise.
(record_final_call, record_dynamic_alloc): New.
* function.h (struct callinfo_callee): New.
(CALLEE_FROM_CGRAPH_P): New.
(struct callinfo_dalloc): New.
(struct stack_usage): Add callees and dallocs.
(record_final_call, record_dynamic_alloc): Declare.
* gimplify.c (gimplify_decl_expr): Record dynamically-allocated
object if -fcallgraph-info=da.
* optabs-libfuncs.c (build_libfunc_function): Keep SYMBOL_REF_DECL.
* print-tree.h (print_decl_identifier): Declare.
(PRINT_DECL_ORIGIN, PRINT_DECL_NAME, PRINT_DECL_UNIQUE_NAME): New.
* print-tree.c: Include print-tree.h.
(print_decl_identifier): New function.
* toplev.c: Include print-tree.h.
(callgraph_info_file): New global variable.
(callgraph_info_indirect_emitted): Likewise.
(output_stack_usage): Rename to...
(output_stack_usage_1): ... this.  Make it static, add cf
parameter.  If -fcallgraph-info=su, print stack usage to cf.
If -fstack-usage, use print_decl_identifier for
pretty-printing.
(INDIRECT_CALL_NAME): New.
(dump_final_indirect_call_node_vcg): New.
(dump_final_callee_vcg, dump_final_node_vcg): New.
(output_stack_usage): New.
(lang_dependent_init): Open and start file if
-fcallgraph-info.
(finalize): If callgraph_info_file is not null, finish it,
close it, and reset callgraph info state.

for  gcc/ada/ChangeLog

* gcc-interface/misc.c (callgraph_info_file): Delete.
---
 gcc/ada/gcc-interface/misc.c |3 -
 gcc/builtins.c   |4 +
 gcc/calls.c  |6 +
 gcc/common.opt   |8 ++
 gcc/doc/invoke.texi  |   17 
 gcc/explow.c |5 +
 gcc/flag-types.h |   16 
 gcc/function.c   |   59 --
 gcc/function.h   |   30 +++
 gcc/gimplify.c   |4 +
 gcc/optabs-libfuncs.c|4 -
 gcc/opts.c   |   26 ++
 gcc/output.h |2 
 gcc/print-tree.c |   76 ++
 gcc/print-tree.h |4 +
 gcc/toplev.c |  179 ++
 16 files changed, 393 insertions(+), 50 deletions(-)

diff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
index 4abd4d5708a54..d68b37384ff7f 100644
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -54,9 +54,6 @@
 #include "ada-tree.h"
 #include "gigi.h"
 
-/* This symbol needs to be defined for the front-end.  */
-void *callgraph_info_file = NULL;
-
 /* Command-line argc and argv.  These variables are global since they are
imported in back_end.adb.  */
 unsigned int save_argc;
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 5d811f113c907..bd302383377ba 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5406,6 +5406,10 @@ expand_builtin_alloca (tree exp)
 = allocate_dynamic_stack_space (op0, 0, align, max_size, alloca_for_var);
   result = convert_memory_address (ptr_mode, result);
 
+  /* Dynamic allocations for variables are recorded during gimplification.  */
+  if (!alloca_for_var && (flag_callgraph_info & CALLGRAPH_INFO_DYNAMIC_ALLOC))
+record_dynamic_alloc (exp);
+
   return result;
 }
 
diff --g

[PING] [PATCH] Fix dwarf-lineinfo inconsistency of inlined subroutines

2019-10-27 Thread Bernd Edlinger
Ping...

I'd like to ping for this patch:
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01459.html


Thanks
Bernd.

On 10/20/19 9:58 PM, Bernd Edlinger wrote:
> Hi,
> 
> this fixes an issue with the gdb step-over aka. "n" command.
> 
> It can be seen when you debug an optimized stage-3 cc1
> it does not affect -O0 code, though.
> 
> This example debug session will explain the effect.
> 
> (gdb) b get_alias_set
> Breakpoint 5 at 0xa099f0: file ../../gcc-trunk/gcc/alias.c, line 837.
> (gdb) r
> Breakpoint 5, get_alias_set (t=t@entry=0x77ff7ab0) at 
> ../../gcc-trunk/gcc/alias.c:837
> 837 if (t == error_mark_node
> (gdb) n
> 839 && (TREE_TYPE (t) == 0 || TREE_TYPE (t) == error_mark_node)))
> (gdb) n
> 3382    return __t;  <-- now we have a problem: wrong line info here
> (gdb) bt
> #0  get_alias_set (t=t@entry=0x77ff7ab0) at 
> ../../gcc-trunk/gcc/tree.h:3382
> #1  0x00b25dfe in set_mem_attributes_minus_bitpos 
> (ref=0x7746f990, t=0x77ff7ab0, objectp=1, bitpos=...)
> at ../../gcc-trunk/gcc/emit-rtl.c:1957
> #2  0x01137a55 in make_decl_rtl (decl=0x77ff7ab0) at 
> ../../gcc-trunk/gcc/varasm.c:1518
> #3  0x0113b6e8 in assemble_variable (decl=0x77ff7ab0, 
> top_level=, at_end=, 
> dont_output_data=0) at ../../gcc-trunk/gcc/varasm.c:2246
> #4  0x0113f0ea in varpool_node::assemble_decl (this=0x7745b000) 
> at ../../gcc-trunk/gcc/varpool.c:584
> #5  0x0113fa17 in varpool_node::assemble_decl (this=0x7745b000) 
> at ../../gcc-trunk/gcc/varpool.c:750
> 
> 
> There are at least two problems here:
> 
> First you did not want to step into the TREE_TYPE, but it happens all
> the time, even if you use "n" to step over it.
> 
> And secondly, from the call stack, you don't know where you are in 
> get_alias_set.
> But the code that is executing at this point is actually the x == 0 || x == 
> error_mark_node
> from alias.c, line 839, which contains the inlined body of the TREE_TYPE, but
> the rest of the if.  So there is an inconsistency in the  
> 
> Contents of the .debug_info section:
> 
>  <2><4f686>: Abbrev Number: 12 (DW_TAG_inlined_subroutine)
> <4f687>   DW_AT_abstract_origin: <0x53d4e>
> <4f68b>   DW_AT_entry_pc: 0x7280
> <4f693>   DW_AT_GNU_entry_view: 1
> <4f695>   DW_AT_ranges  : 0xb480
> <4f699>   DW_AT_call_file   : 8  <- alias.c
> <4f69a>   DW_AT_call_line   : 839
> <4f69c>   DW_AT_call_column : 8
> <4f69d>   DW_AT_sibling : <0x4f717>
> 
>  The File Name Table (offset 0x253):
>   8 2   0   0   alias.c
>   102   0   0   tree.h
> 
> Contents of the .debug_ranges section:
> 
> b480 7280 7291 
> b480 2764 277e 
> b480 
> 
> The problem is at pc=0x7291 in the Line Number Section:
> 
>  Line Number Statements:
> 
>   [0x8826]  Special opcode 61: advance Address by 4 to 0x7284 and Line by 
> 0 to 3380
>   [0x8827]  Set is_stmt to 1
>   [0x8828]  Special opcode 189: advance Address by 13 to 0x7291 and Line 
> by 2 to 3382 (*)
>   [0x8829]  Set is_stmt to 0 (**)
>   [0x882a]  Copy (view 1)
>   [0x882b]  Set File Name to entry 8 in the File Name Table <- back to 
> alias.c
>   [0x882d]  Set column to 8
>   [0x882f]  Advance Line by -2543 to 839
>   [0x8832]  Copy (view 2)
>   [0x8833]  Set column to 27
>   [0x8835]  Special opcode 61: advance Address by 4 to 0x7295 and Line by 
> 0 to 839
>   [0x8836]  Set column to 3
>   [0x8838]  Set is_stmt to 1 <-- next line info counts: alias.c:847
>   [0x8839]  Special opcode 153: advance Address by 10 to 0x729f and Line 
> by 8 to 847
>   [0x883a]  Set column to 7
> 
> (*) this line is tree.h:3382, but the program counter is *not* within the 
> subroutine,
> but exactly at the first instruction *after* the subroutine according to the 
> debug_ranges.
> 
> What makes it worse, is that (**) makes gdb ignore the new location info 
> alias.c:839,
> which means, normally the n command would have continued to pc=0x729f, which 
> is at alias.c:847.
> 
> 
> The problem happens due to a block with only var
> This patch fixes this problem by moving (**) to the first statement with a 
> different line number.
> 
> In alias.c.316r.final this looks like that:
> 
> (note 2884 2883 1995 31 0x7f903a931ba0 NOTE_INSN_BLOCK_BEG)
> (note 1995 2884 2885 31 ../../gcc-trunk/gcc/tree.h:3377 
> NOTE_INSN_INLINE_ENTRY)
> (note 2885 1995 1996 31 0x7f903a931c00 NOTE_INSN_BLOCK_BEG)
> [...]
> (note 50 39 59 32 [bb 32] NOTE_INSN_BASIC_BLOCK)
> (note 59 50 60 32 NOTE_INSN_DELETED)
> (note 60 59 1997 32 NOTE_INSN_DELETED)
> (note 1997 60 2239 32 ../../gcc-trunk/gcc/tree.h:3382 NOTE_INSN_BEGIN_STMT)
> (note 2239 1997 2240 32 (var_location __tD.143911 (nil)) 
> NOTE_INSN_VAR_LOCATION)
> (note 2240 2239 2241 32 (var_location __sD.143912 (nil)) 
> NOTE_INSN_VAR_LOCATION)
> (note 2241 2240 2242 32 (var_location __fD.14

Free ipa-prop edge summaries for inline calls

2019-10-27 Thread Jan Hubicka
Hi,
this patch makes ipa-prop free edge summaries (jump functions) for
calls which have been inlined because they are no longer useful.
The main change is to change IPA_EDGE_REF from get_create to get,
and thus we need to watch for missing summaries at some places so
that combining -O0 and -O2 code works.

Bootstrapped/regtested x86_64-linux, committed.

* ipa-cp.c (propagate_constants_across_call): If args are not available
just drop everything to varying.
(find_aggregate_values_for_callers_subset): Watch for missing
edge summary.
(find_more_scalar_values_for_callers_subs): Likewise.
* ipa-prop.c (ipa_compute_jump_functions_for_edge,
update_jump_functions_after_inlining, propagate_controlled_uses):
Watch for missing summaries.
(ipa_propagate_indirect_call_infos): Remove summary after propagation
is finished.
(ipa_write_node_info): Watch for missing summaries.
(ipa_read_edge_info): Create new ref.
(ipa_edge_args_sum_t): Add remove.
(IPA_EDGE_REF_GET_CREATE): New macro.
* ipa-fnsummary.c (evaluate_properties_for_edge): Watch for missing
edge summary.
(remap_edge_change_prob): Likewise.
Index: ipa-cp.c
===
--- ipa-cp.c(revision 277474)
+++ ipa-cp.c(working copy)
@@ -2309,10 +2309,17 @@ propagate_constants_across_call (struct
   callee_info = IPA_NODE_REF (callee);
 
   args = IPA_EDGE_REF (cs);
-  args_count = ipa_get_cs_argument_count (args);
   parms_count = ipa_get_param_count (callee_info);
   if (parms_count == 0)
 return false;
+  if (!args)
+{
+  for (i = 0; i < parms_count; i++)
+   ret |= set_all_contains_variable (ipa_get_parm_lattices (callee_info,
+i));
+  return ret;
+}
+  args_count = ipa_get_cs_argument_count (args);
 
   /* If this call goes through a thunk we must not propagate to the first (0th)
  parameter.  However, we might need to uncover a thunk from below a series
@@ -4066,7 +4073,8 @@ find_more_scalar_values_for_callers_subs
  if (IPA_NODE_REF (cs->caller)->node_dead)
continue;
 
- if (i >= ipa_get_cs_argument_count (IPA_EDGE_REF (cs))
+ if (!IPA_EDGE_REF (cs)
+ || i >= ipa_get_cs_argument_count (IPA_EDGE_REF (cs))
  || (i == 0
  && call_passes_through_thunk_p (cs)))
{
@@ -4135,7 +4143,8 @@ find_more_contexts_for_caller_subset (cg
 
   FOR_EACH_VEC_ELT (callers, j, cs)
{
- if (i >= ipa_get_cs_argument_count (IPA_EDGE_REF (cs)))
+ if (!IPA_EDGE_REF (cs)
+ || i >= ipa_get_cs_argument_count (IPA_EDGE_REF (cs)))
return;
  ipa_jump_func *jfunc = ipa_get_ith_jump_func (IPA_EDGE_REF (cs),
i);
@@ -4451,6 +4460,11 @@ find_aggregate_values_for_callers_subset
 
   FOR_EACH_VEC_ELT (callers, j, cs)
 {
+  if (!IPA_EDGE_REF (cs))
+   {
+ count = 0;
+ break;
+   }
   int c = ipa_get_cs_argument_count (IPA_EDGE_REF (cs));
   if (c < count)
count = c;
Index: ipa-fnsummary.c
===
--- ipa-fnsummary.c (revision 277474)
+++ ipa-fnsummary.c (working copy)
@@ -452,6 +452,7 @@ evaluate_properties_for_edge (struct cgr
   class ipa_fn_summary *info = ipa_fn_summaries->get (callee);
   vec known_vals = vNULL;
   vec known_aggs = vNULL;
+  class ipa_edge_args *args;
 
   if (clause_ptr)
 *clause_ptr = inline_p ? 0 : 1 << predicate::not_inlined_condition;
@@ -462,10 +463,10 @@ evaluate_properties_for_edge (struct cgr
 
   if (ipa_node_params_sum
   && !e->call_stmt_cannot_inline_p
-  && ((clause_ptr && info->conds) || known_vals_ptr || known_contexts_ptr))
+  && ((clause_ptr && info->conds) || known_vals_ptr || known_contexts_ptr)
+  && (args = IPA_EDGE_REF (e)) != NULL)
 {
   class ipa_node_params *caller_parms_info, *callee_pi;
-  class ipa_edge_args *args = IPA_EDGE_REF (e);
   class ipa_call_summary *es = ipa_call_summaries->get (e);
   int i, count = ipa_get_cs_argument_count (args);
 
@@ -3160,6 +3161,8 @@ remap_edge_change_prob (struct cgraph_ed
 {
   int i;
   class ipa_edge_args *args = IPA_EDGE_REF (edge);
+  if (!args)
+   return;
   class ipa_call_summary *es = ipa_call_summaries->get (edge);
   class ipa_call_summary *inlined_es
= ipa_call_summaries->get (inlined_edge);
Index: ipa-prop.c
===
--- ipa-prop.c  (revision 277474)
+++ ipa-prop.c  (working copy)
@@ -1854,7 +1854,7 @@ ipa_compute_jump_functions_for_edge (str
 struct cgraph_edge *cs)
 {
   class ipa_node_params *info = IPA_NODE_REF (cs->caller);
-  class ipa_

Fix summaries for expanded thunks

2019-10-27 Thread Jan Hubicka
Hi,
this is similar to the previous patch: when expanding a thunk, summaries
need to be recomputed.  Bootstrapped/regtested x86_64-linux, committed.

Honza

* ipa-inline-transform.c (inline_call): Update function summaries
after expanding thunk.
Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision 277474)
+++ ipa-inline-transform.c  (working copy)
@@ -352,12 +352,14 @@ inline_call (struct cgraph_edge *e, bool
   if (to->thunk.thunk_p)
 {
   struct cgraph_node *target = to->callees->callee;
+  symtab->call_cgraph_removal_hooks (to);
   if (in_lto_p)
to->get_untransformed_body ();
   to->expand_thunk (false, true);
   /* When thunk is instrumented we may have multiple callees.  */
   for (e = to->callees; e && e->callee != target; e = e->next_callee)
;
+  symtab->call_cgraph_insertion_hooks (to);
   gcc_assert (e);
 }
 


Re: [Patch, Fortran] PR91863 - fix call to bind(C) with array descriptor

2019-10-27 Thread Paul Richard Thomas
Hi Tobias,

Thanks for taking care of this. OK for trunk and 9-branch.

Cheers

Paul

On Wed, 23 Oct 2019 at 14:07, Tobias Burnus  wrote:
>
> With the trunk, there are three issues:
>
> (a) With bind(C), the callee side handles deallocation with intent(out).
>
> This should produce the code:
>  if (cfi.0 != 0B)
>{
>  __builtin_free (cfi.0);
>  cfi.0 = 0B;
>}
> This fails as cfi.0 (of type 'void*') is dereferenced and
> *cfi.0 = 0B' (i.e. assignment of type 'void') causes the ICE.
>
>
> (b) With that fixed, one gets:
>  sub (cfi.4);
>  _gfortran_cfi_desc_to_gfc_desc (&a, &cfi.4);
>  if (cfi.4 != 0B)
>__builtin_free (cfi.4);
>  ... code using "a" ...
> That also won't shine as 'a.data' == 'cfi.4'; hence, one
> accesses already freed memory.
>
> I don't see whether freeing the cfi memory makes sense at all;
> as I didn't come up with a reason, I removed it for good.
>
>
> Those issues, I have solved. The third issue is now PR fortran/92189:
> (c) When allocating memory in a Fortran-written Bind(C) function, the
> shape/bounds changes are not propagated back to Fortran.
> Namely, "sub" lacks some _gfortran_gfc_desc_to_cfi_desc call at the end!
>
> The issue pops up, if you change 'dg-do compile' into 'dg-do run'. For
> using a C-written function, that's a non-issue. Hence, it makes sense
> to fix (a)+(b) of the bug separately.
>
>
> OK for the trunk and GCC 9? (At least the ICE is a regression.)
>
> Tobias
>


-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [C++ PATCH] PR c++/90875 - added -Wswitch-outside-range option.

2019-10-27 Thread Gerald Pfeifer
On Fri, 21 Jun 2019, Marek Polacek wrote:
>> 2019-06-20  Matthew Beliveau  
>> 
>>  PR c++/90875 - added -Wswitch-outside-range option
>>  * doc/invoke.texi (Wswitch-outside-range): Document.

I noticed this is not yet covered in the GCC 10 release notes at
https://gcc.gnu.org/gcc-10/changes.html .

https://gcc.gnu.org/about.html has some information how to go about
our web pages (in GIT now, the CVS era is over ;-), and I'm happy to
help if you have any questions.

Gerald