Re: [PATCH 02/10] [i386] Enable _Float16 type for TARGET_SSE2 and above.

2021-07-28 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 29, 2021 at 12:53 PM Hongtao Liu  wrote:
>
> On Thu, Jul 29, 2021 at 5:57 AM Joseph Myers  wrote:
> >
> > On Wed, 21 Jul 2021, liuhongt via Gcc-patches wrote:
> >
> > > @@ -23254,13 +23337,15 @@ ix86_get_excess_precision (enum excess_precision_type type)
> > >  provide would be identical were it not for the unpredictable
> > >  cases.  */
> > >   if (!TARGET_80387)
> > > -   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > > +   return TARGET_SSE2
> > > +  ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> > > +  : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > >   else if (!TARGET_MIX_SSE_I387)
> > > {
> > >   if (!(TARGET_SSE && TARGET_SSE_MATH))
> > > return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
> > >   else if (TARGET_SSE2)
> > > -   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > > +   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
> > > }
> > >
> > >   /* If we are in standards compliant mode, but we know we will
> >
> > This patch is not changing the default "fast" mode at all; that's
> > promoting to float, unconditionally.  But you have a subsequent change
> > there in patch 4 to make the promotions in the default "fast" mode depend
> > on hardware support for the new instructions; it's unhelpful for the
> > documentation not to correspond exactly to the code changes in the same
> > patch.
> Yes, will change.
> >
> > Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
> > (i.e. whenever the type is available), it might make more sense to follow
> > AArch64 and use it only when the hardware instructions are available.  In
> > any case, it seems peculiar to use a different threshold in the "fast"
>   We want to provide some debuggability for the software emulation.
> When software emulation and hardware instructions give inconsistent
> results, users can still debug on a non-AVX512-FP16 processor with
> software emulation and the extra option -fexcess-precision=standard.
> Also, since TARGET_C_EXCESS_PRECISION is not tied to the type, building
> GCC with --with-arch=sapphirerapids would regress testcases that do not
> use _Float16 and are supposed to run on the x86 FPU, e.g.
> gcc.target/i386/excess-precision-*.c.  That's why we can't follow
> AArch64.
> > case from the "standard" case.  -fexcess-precision=standard is not "avoid
> > excess precision", it's "implement excess precision in the front end".
> > Whenever "fast" is implementing excess precision in the front end,
> > "standard" should be doing the same thing as "fast".
> >
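The distinction drawn above (rounding after every operation versus keeping promoted intermediates and rounding once) can be illustrated with a small sketch. It is only an analogy: it uses float and double as stand-ins for _Float16 and float, since _Float16 availability depends on the compiler and target, and it assumes the default round-to-nearest mode. The function names are made up for illustration and are not part of the patch.

```c
/* Sum three floats, rounding to float after every operation
   (no excess precision).  The volatile temporaries force each
   intermediate value to be stored as a float.  */
float
sum_rounded_each_step (float a, float b, float c)
{
  volatile float t = a + b;
  volatile float r = t + c;
  return r;
}

/* Sum three floats with the intermediate kept in a wider type
   (double here) and a single rounding at the end, which mimics
   promotions inserted by the front end.  */
float
sum_with_excess_precision (float a, float b, float c)
{
  double wide = (double) a + (double) b + (double) c;
  return (float) wide;
}
```

With a = 16777216.0f (2^24) and b = c = 1.0f, the first function loses both increments to rounding, while the second keeps them because the sum is exact in double and 16777218 is representable as a float.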
> > > +Soft-fp keeps the intermediate result of the operation at 32-bit precision by defaults,
> > > +which may lead to inconsistent behavior between soft-fp and avx512fp16 instructions,
> > > +using @option{-fexcess-precision=standard} will force round back after every operation.
> >
> > "soft-fp" is, as the name of some code within GCC, an internal
> > implementation detail, which should not be referenced in the user manual.
> > What results in intermediate results being in a wider precision is not
> > soft-fp; it's promotions inserted by the front end as a result of how the
> > above hook is defined (promotions inserted by the optabs/expand code are
> > an implementation detail that should always be followed automatically by a
> > truncation of the result and so not be user-visible).
> Yes, will reorganize the words.
> >
> > As far as I know, the official name of "avx512fp16" is "AVX512-FP16" and
> > text in the manual should use the official capitalization, hyphenation
> > etc. in such names unless literally referring to command-line options
> > inside @option or similar.
> Yes, will change.
> >
Updated the patch for the documentation.
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com
>
>
>
> --
> BR,
> Hongtao

Also, as a follow-up to [1], I merged the change below into the updated patch.
Richard, please comment under this thread.
> > > +  /* FIXME: validate_subreg only allows (subreg:WORD_MODE (reg:HF) 0). */
> >
> > I think that needs "fixing" then, or alternatively the caller should care.
> >
> How about this
>
> modified   gcc/emit-rtl.c
> @@ -928,6 +928,10 @@ validate_subreg (machine_mode omode, machine_mode imode,
>   fix them all.  */
>if (omode == word_mode)
>  ;
> +  /* ???Similarly like (subreg:DI (reg:SF), also allow (subreg:SI (reg:HF))
> + here. Though extract_bit_field is the culprit here, not the backends.  */
> +  else if (imode == HFmode && omode == SImode)
> +;
>/* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
>   is the culprit here, and not the backends.  */
>else if (known_ge (osize, regsize) && known_ge (isize, osize))
> new file   gcc/testsuite/gcc.target/i386/float16-5.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse2 -O2" } */
> +_Float16
> +foo (int a)
> +{
> +  union {
> +int a;
> +_Float16 b;
> +  }c;

Re: [PATCH 02/10] [i386] Enable _Float16 type for TARGET_SSE2 and above.

2021-07-28 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 29, 2021 at 5:57 AM Joseph Myers  wrote:
>
> On Wed, 21 Jul 2021, liuhongt via Gcc-patches wrote:
>
> > @@ -23254,13 +23337,15 @@ ix86_get_excess_precision (enum excess_precision_type type)
> >  provide would be identical were it not for the unpredictable
> >  cases.  */
> >   if (!TARGET_80387)
> > -   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > +   return TARGET_SSE2
> > +  ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> > +  : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> >   else if (!TARGET_MIX_SSE_I387)
> > {
> >   if (!(TARGET_SSE && TARGET_SSE_MATH))
> > return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
> >   else if (TARGET_SSE2)
> > -   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > +   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
> > }
> >
> >   /* If we are in standards compliant mode, but we know we will
>
> This patch is not changing the default "fast" mode at all; that's
> promoting to float, unconditionally.  But you have a subsequent change
> there in patch 4 to make the promotions in the default "fast" mode depend
> on hardware support for the new instructions; it's unhelpful for the
> documentation not to correspond exactly to the code changes in the same
> patch.
Yes, will change.
>
> Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
> (i.e. whenever the type is available), it might make more sense to follow
> AArch64 and use it only when the hardware instructions are available.  In
> any case, it seems peculiar to use a different threshold in the "fast"
  We want to provide some debuggability for the software emulation.
When software emulation and hardware instructions give inconsistent
results, users can still debug on a non-AVX512-FP16 processor with
software emulation and the extra option -fexcess-precision=standard.
Also, since TARGET_C_EXCESS_PRECISION is not tied to the type, building
GCC with --with-arch=sapphirerapids would regress testcases that do not
use _Float16 and are supposed to run on the x86 FPU, e.g.
gcc.target/i386/excess-precision-*.c.  That's why we can't follow
AArch64.
> case from the "standard" case.  -fexcess-precision=standard is not "avoid
> excess precision", it's "implement excess precision in the front end".
> Whenever "fast" is implementing excess precision in the front end,
> "standard" should be doing the same thing as "fast".
>
> > +Soft-fp keeps the intermediate result of the operation at 32-bit precision by defaults,
> > +which may lead to inconsistent behavior between soft-fp and avx512fp16 instructions,
> > +using @option{-fexcess-precision=standard} will force round back after every operation.
>
> "soft-fp" is, as the name of some code within GCC, an internal
> implementation detail, which should not be referenced in the user manual.
> What results in intermediate results being in a wider precision is not
> soft-fp; it's promotions inserted by the front end as a result of how the
> above hook is defined (promotions inserted by the optabs/expand code are
> an implementation detail that should always be followed automatically by a
> truncation of the result and so not be user-visible).
Yes, will reorganize the words.
>
> As far as I know, the official name of "avx512fp16" is "AVX512-FP16" and
> text in the manual should use the official capitalization, hyphenation
> etc. in such names unless literally referring to command-line options
> inside @option or similar.
Yes, will change.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com



-- 
BR,
Hongtao


[PATCH] Objective-C: don't require redundant -fno-objc-sjlj-exceptions for the NeXT v2 ABI

2021-07-28 Thread Matt Jacobson via Gcc-patches
As is, an invocation of GCC with -fnext-runtime -fobjc-abi-version=2 crashes, 
unless target-specific code adds an implicit -fno-objc-sjlj-exceptions (which 
Darwin does).

This patch makes the general case not crash.

I don't have commit access, so if this patch is suitable, I'd need someone else
to commit it for me.  Thanks.

gcc/objc/ChangeLog:

2021-07-28  Matt Jacobson  

* objc-next-runtime-abi-02.c (objc_next_runtime_abi_02_init): Warn
about and reset flag_objc_sjlj_exceptions regardless of
flag_objc_exceptions.


gcc/c-family/ChangeLog:

2021-07-28  Matt Jacobson  

* c-opts.c (c_common_post_options): Default to
flag_objc_sjlj_exceptions = 1 only when flag_objc_abi < 2.

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index c51d6d34726..2568df67972 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -840,9 +840,9 @@ c_common_post_options (const char **pfilename)
   else if (!flag_gnu89_inline && !flag_isoc99)
 error ("%<-fno-gnu89-inline%> is only supported in GNU99 or C99 mode");
 
-  /* Default to ObjC sjlj exception handling if NeXT runtime.  */
+  /* Default to ObjC sjlj exception handling if NeXT  (SIZEHASHTABLE);
 
-  if (flag_objc_exceptions && flag_objc_sjlj_exceptions)
+  if (flag_objc_sjlj_exceptions)
 {
   inform (UNKNOWN_LOCATION,
  "%<-fobjc-sjlj-exceptions%> is ignored for "



Re: [PATCH] c++tools, configury: Configure with C++; test checking status [PR98821].

2021-07-28 Thread Jason Merrill via Gcc-patches

On 7/20/21 11:21 AM, Iain Sandoe wrote:

Hi Folks,

Following Jakub’s suggestions (on IRC), here is a patch that works around
misconfiguration of the c++tools directory present for at least Linux and
Darwin (probably on any platform that does not have typedefs for the inet
structs in its system headers).

This also pulls in tests for the checking configure flags (copied from
libcpp) and the implementations of gcc_assert (copied from gcc).  Actually,
there’s not much original code here - but the combination is new, of course.

Tested lightly on Linux and Darwin for master with and without
--disable-checking and on gcc-11 with default (release).  At least the
configures now seem to do the right thing for those.

OK for master and GCC-11.2?
  (if a complete regtest passes for both)


OK.


thanks
Iain




The c++tools configure fragments need to be built with a C++ compiler.

In addition, the stand-alone server uses diagnostic mechanisms in common
with GCC, but needs to define implementations of the asserts and
supporting output functions.

Signed-off-by: Iain Sandoe 

PR c++/98821 - modules : c++tools configures with CC but code fragments assume CXX.

PR c++/98821

c++tools/ChangeLog:

* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Configure using C++.  Pull logic to
detect enabled checking modes.
* server.cc (AI_NUMERICSERV): Define a fallback value.
(gcc_assert): New.
(gcc_checking_assert): New.
(gcc_unreachable): New.
(fancy_abort): Only build when checking is enabled.

Co-authored-by: Jakub Jelinek 
---
  c++tools/config.h.in  |  10 +
  c++tools/configure| 766 +++---
  c++tools/configure.ac |  58 
  c++tools/server.cc|  35 ++
  4 files changed, 228 insertions(+), 641 deletions(-)

diff --git a/c++tools/configure.ac b/c++tools/configure.ac
index 70fcb641db9..cb67dabf191 100644
--- a/c++tools/configure.ac
+++ b/c++tools/configure.ac
@@ -41,6 +41,8 @@ MISSING=`cd $ac_aux_dir && ${PWDCMD-pwd}`/missing
  AC_CHECK_PROGS([AUTOCONF], [autoconf], [$MISSING autoconf])
  AC_CHECK_PROGS([AUTOHEADER], [autoheader], [$MISSING autoheader])
  
+AC_LANG(C++)
+
  dnl Enabled by default
  AC_MSG_CHECKING([whether to build C++ tools])
  AC_ARG_ENABLE(c++-tools,
@@ -67,6 +69,62 @@ AC_MSG_RESULT([$maintainer_mode])
  test "$maintainer_mode" = yes && MAINTAINER=yes
  AC_SUBST(MAINTAINER)
  
+# Enable expensive internal checks
+is_release=
+if test -f $srcdir/../gcc/DEV-PHASE \
+   && test x"`cat $srcdir/../gcc/DEV-PHASE`" != xexperimental; then
+  is_release=yes
+fi
+
+AC_ARG_ENABLE(checking,
+[AS_HELP_STRING([[--enable-checking[=LIST]]],
+   [enable expensive run-time checks.  With LIST,
+enable only specific categories of checks.
+Categories are: yes,no,all,none,release.
+Flags are: misc,valgrind or other strings])],
+[ac_checking_flags="${enableval}"],[
+# Determine the default checks.
+if test x$is_release = x ; then
+  ac_checking_flags=yes
+else
+  ac_checking_flags=release
+fi])
+IFS="${IFS=   }"; ac_save_IFS="$IFS"; IFS="$IFS,"
+for check in release $ac_checking_flags
+do
+   case $check in
+   # these set all the flags to specific states
+   yes|all) ac_checking=1 ; ac_assert_checking=1 ; ac_valgrind_checking= ;;
+   no|none) ac_checking= ; ac_assert_checking= ; ac_valgrind_checking= ;;
+   release) ac_checking= ; ac_assert_checking=1 ; ac_valgrind_checking= ;;
+   # these enable particular checks
+   assert) ac_assert_checking=1 ;;
+   misc) ac_checking=1 ;;
+   valgrind) ac_valgrind_checking=1 ;;
+   # accept
+   *) ;;
+   esac
+done
+IFS="$ac_save_IFS"
+
+if test x$ac_checking != x ; then
+  AC_DEFINE(CHECKING_P, 1,
+[Define to 1 if you want more run-time sanity checks.])
+else
+  AC_DEFINE(CHECKING_P, 0)
+fi
+
+if test x$ac_assert_checking != x ; then
+  AC_DEFINE(ENABLE_ASSERT_CHECKING, 1,
+[Define if you want assertions enabled.  This is a cheap check.])
+fi
+
+if test x$ac_valgrind_checking != x ; then
+  AC_DEFINE(ENABLE_VALGRIND_CHECKING, 1,
+[Define if you want to workaround valgrind (a memory checker) warnings about
+ possible memory leaks because of libcpp use of interior pointers.])
+fi
+
  # Check whether --enable-default-pie was given.
  AC_ARG_ENABLE(default-pie,
  [AS_HELP_STRING([--enable-default-pie],
diff --git a/c++tools/server.cc b/c++tools/server.cc
index fae3e78dc5d..3056352e24b 100644
--- a/c++tools/server.cc
+++ b/c++tools/server.cc
@@ -61,6 +61,10 @@ along with GCC; see the file COPYING3.  If not see
  # define gai_strerror(X) ""
  #endif
  
+#ifndef AI_NUMERICSERV
+#define AI_NUMERICSERV 0
+#endif
+
  #include 
  
  // Select or epoll

@@ -92,6 +96,35 @@ along with GCC; see the file COPYING3.  If not see
  #define DIR_SEPARATOR '/'
  #endif
  
+/* Imported from libcpp/system.h
+   Use gcc_assert(EXPR) to test invariants.  */
+#if 

[PATCH] Adjust/Refine testcases.

2021-07-28 Thread liuhongt via Gcc-patches
  Committed as an obvious fix, and opened PR101668 to record the issue related
to pr92658-{avx512bw-2,sse4-2,sse4}.c.

gcc/testsuite/ChangeLog:

PR target/99881
* gcc.target/i386/pr91446.c: Adjust testcase.
* gcc.target/i386/pr92658-avx512bw-2.c: Ditto.
* gcc.target/i386/pr92658-sse4-2.c: Ditto.
* gcc.target/i386/pr92658-sse4.c: Ditto.
* gcc.target/i386/pr99881.c: Refine testcase.
---
 gcc/testsuite/gcc.target/i386/pr91446.c| 2 +-
 gcc/testsuite/gcc.target/i386/pr92658-avx512bw-2.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr92658-sse4-2.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr92658-sse4.c   | 2 +-
 gcc/testsuite/gcc.target/i386/pr99881.c| 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr91446.c b/gcc/testsuite/gcc.target/i386/pr91446.c
index f7c4bea616d..0243ca3ea68 100644
--- a/gcc/testsuite/gcc.target/i386/pr91446.c
+++ b/gcc/testsuite/gcc.target/i386/pr91446.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O2 -march=skylake -ftree-slp-vectorize -mtune-ctrl=^sse_typeless_stores" } */
+/* { dg-options "-O2 -march=icelake-server -ftree-slp-vectorize -mtune-ctrl=^sse_typeless_stores" } */
 
 typedef struct
 {
diff --git a/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-2.c b/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-2.c
index 33eecbf3afa..3176f85ee6b 100644
--- a/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-2.c
@@ -1,6 +1,6 @@
 /* PR target/92658 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -mavx512bw -mprefer-vector-width=512" } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -mavx512bw -mprefer-vector-width=512" } */
 
 typedef char v64qi __attribute__((vector_size (64)));
 typedef short v32hi __attribute__((vector_size (64)));
diff --git a/gcc/testsuite/gcc.target/i386/pr92658-sse4-2.c b/gcc/testsuite/gcc.target/i386/pr92658-sse4-2.c
index 53e89ad1052..a1cf9e78f6c 100644
--- a/gcc/testsuite/gcc.target/i386/pr92658-sse4-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr92658-sse4-2.c
@@ -1,6 +1,6 @@
 /* PR target/92658 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -msse4.1" } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse4.1" } */
 
 typedef char v16qi __attribute__((vector_size (16)));
 typedef short v8hi __attribute__((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/pr92658-sse4.c b/gcc/testsuite/gcc.target/i386/pr92658-sse4.c
index e12e1639b7d..9fd2eeeccab 100644
--- a/gcc/testsuite/gcc.target/i386/pr92658-sse4.c
+++ b/gcc/testsuite/gcc.target/i386/pr92658-sse4.c
@@ -1,6 +1,6 @@
 /* PR target/92658 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -msse4.1" } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse4.1" } */
 
 typedef unsigned char v16qi __attribute__((vector_size (16)));
 typedef unsigned short v8hi __attribute__((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/pr99881.c b/gcc/testsuite/gcc.target/i386/pr99881.c
index 7ae51c8310d..a1ec1d1ba8a 100644
--- a/gcc/testsuite/gcc.target/i386/pr99881.c
+++ b/gcc/testsuite/gcc.target/i386/pr99881.c
@@ -1,7 +1,7 @@
 /* PR target/99881.  */
-/* { dg-do compile } */
+/* { dg-do compile { target { ! ia32 } } } */
 /* { dg-options "-Ofast -march=skylake" } */
-/* { dg-final { scan-assembler-not "xmm[0-9]" } } */
+/* { dg-final { scan-assembler-not "xmm\[0-9\]" } } */
 
 void
 foo (int* __restrict a, int n, int c)
-- 
2.27.0



Re: [PATCH] [i386] Add a separate function to calculate cost for WIDEN_MULT_EXPR.

2021-07-28 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 28, 2021 at 8:36 PM Richard Biener
 wrote:
>
> On Wed, Jul 28, 2021 at 10:35 AM liuhongt  wrote:
> >
> > Hi:
> >   As described in PR 39821, WIDEN_MULT_EXPR should use a different cost
> > model from MULT_EXPR; this patch adds ix86_widen_mult_cost for that.
> > The reference basis for the cost model is https://godbolt.org/z/EMjaz4Knn.
> >
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >
> > gcc/ChangeLog:
>
> can you reference PR target/39821 please?
>
Added.
> > * config/i386/i386.c (ix86_widen_mult_cost): New function.
> > (ix86_add_stmt_cost): Use ix86_widen_mult_cost for
> > WIDEN_MULT_EXPR.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/sse2-pr39821.c: New test.
> > * gcc.target/i386/sse4-pr39821.c: New test.
> > ---
> >  gcc/config/i386/i386.c   | 48 +++-
> >  gcc/testsuite/gcc.target/i386/sse2-pr39821.c | 45 ++
> >  gcc/testsuite/gcc.target/i386/sse4-pr39821.c |  4 ++
> >  3 files changed, 96 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-pr39821.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/sse4-pr39821.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 876a19f4c1f..281b5fe2706 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -19757,6 +19757,44 @@ ix86_vec_cost (machine_mode mode, int cost)
> >return cost;
> >  }
> >
> > +/* Return cost of vec_widen_mult_hi/lo_,
> > +   vec_widen_mul_hi/lo_ is only available for VI124_AVX2.  */
> > +static int
> > +ix86_widen_mult_cost (const struct processor_costs *cost,
> > + enum machine_mode mode, bool uns_p)
> > +{
> > +  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
> > +  int extra_cost = 0;
> > +  int basic_cost = 0;
> > +  switch (mode)
> > +{
> > +case V8HImode:
> > +case V16HImode:
> > +  if (!uns_p || mode == V16HImode)
> > +   extra_cost = cost->sse_op * 2;
> > +  basic_cost = cost->mulss * 2 + cost->sse_op * 4;
> > +  break;
> > +case V4SImode:
> > +case V8SImode:
> > +  /* pmulhw/pmullw can be used.  */
> > +  basic_cost = cost->mulss * 2 + cost->sse_op * 2;
> > +  break;
> > +case V2DImode:
> > +  /* pmuludq under sse2, pmuldq under sse4.1, for sign_extend,
> > +require extra 4 mul, 4 add, 4 cmp and 2 shift.  */
> > +  if (!TARGET_SSE4_1 && !uns_p)
> > +   extra_cost = (cost->mulss + cost->addss + cost->sse_op) * 4
> > + + cost->sse_op * 2;
> > +  /* Fallthru.  */
> > +case V4DImode:
> > +  basic_cost = cost->mulss * 2 + cost->sse_op * 4;
> > +  break;
> > +default:
> > +  gcc_unreachable();
> > +}
> > +  return ix86_vec_cost (mode, basic_cost + extra_cost);
> > +}
> > +
> >  /* Return cost of multiplication in MODE.  */
> >
> >  static int
> > @@ -22483,10 +22521,18 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> >   break;
> >
> > case MULT_EXPR:
> > -   case WIDEN_MULT_EXPR:
> > + /*For MULT_HIGHPART_EXPR, x86 only supports pmulhw,
>
> Space after /*
>

Changed.

> otherwise OK.

Thanks for the review; attached is the patch I'm going to check in.
>
> > +   take it as MULT_EXPR.  */
> > case MULT_HIGHPART_EXPR:
> >   stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
> >   break;
> > + /* There's no direct instruction for WIDEN_MULT_EXPR,
> > +take emulation into account.  */
> > +   case WIDEN_MULT_EXPR:
> > + stmt_cost = ix86_widen_mult_cost (ix86_cost, mode,
> > +   TYPE_UNSIGNED (vectype));
> > + break;
> > +
> > case NEGATE_EXPR:
> >   if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
> > stmt_cost = ix86_cost->sse_op;
> > diff --git a/gcc/testsuite/gcc.target/i386/sse2-pr39821.c b/gcc/testsuite/gcc.target/i386/sse2-pr39821.c
> > new file mode 100644
> > index 000..bcd4b772c98
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/sse2-pr39821.c
> > @@ -0,0 +1,45 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-msse2 -mno-sse4.1 -O3 -fdump-tree-vect-details" } */
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */
> > +#include <stdint.h>
> > +void
> > +vec_widen_smul8 (int16_t* __restrict v3, int8_t *v1, int8_t *v2, int order)
> > +{
> > +  while (order--)
> > +*v3++ = (int16_t) *v1++ * *v2++;
> > +}
> > +
> > +void
> > +vec_widen_umul8(uint16_t* __restrict v3, uint8_t *v1, uint8_t *v2, int order)
> > +{
> > +  while (order--)
> > +*v3++ = (uint16_t) *v1++ * *v2++;
> > +}
> > +
> > +void
> > +vec_widen_smul16(int32_t* __restrict v3, int16_t *v1, int16_t *v2, int order)
> > +{
> > +  while (order--)
> > +*v3++ = (int32_t) *v1++ * *v2++;
> > +}
> > +
> > +void
> > +vec_widen_umul16(uint32_t* 

Re: Repost: [PATCH] PR 100168: Fix call test on power10.

2021-07-28 Thread Segher Boessenkool
On Wed, Jul 07, 2021 at 04:08:39PM -0400, Michael Meissner wrote:
> [PATCH] PR 100168: Fix call test on power10.
> 
> Fix a test that was checking for 64-bit TOC calls, to also allow for
> PC-relative calls.

> --- a/gcc/testsuite/gcc.dg/pr56727-2.c
> +++ b/gcc/testsuite/gcc.dg/pr56727-2.c
> @@ -18,4 +18,4 @@ void h ()
>  
>  /* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } } } */
>  /* { dg-final { scan-assembler "@(PLT|plt)" { target { powerpc*-*-linux* && ilp32 } } } } */
> -/* { dg-final { scan-assembler "bl f\n\\s*nop" { target { powerpc*-*-linux* && lp64 } } } } */
> +/* { dg-final { scan-assembler "(bl f\n\\s*nop)|(bl f@notoc)" { target { powerpc*-*-linux* && lp64 } } } } */

The parentheses are superfluous.  Maybe just write it as
{ dg-final { scan-assembler {bl f(\n\s*nop|@notoc)} { target { powerpc*-*-linux* && lp64 } } } } */
though?
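
As a rough check that the factored pattern accepts the same text as the original, here is a sketch using POSIX extended regular expressions from C. This is an approximation and not what the testsuite runs: DejaGnu uses Tcl regular expressions, so \s is replaced by [[:space:]] here, and the helper names are invented for the illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT contains a match for the POSIX extended regex PATTERN,
   0 if it does not, and -1 if PATTERN fails to compile.  */
static int
re_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Original form from the patch: two fully parenthesized alternatives.  */
static const char original_re[] = "(bl f\n[[:space:]]*nop)|(bl f@notoc)";
/* Suggested form: the common "bl f" prefix is factored out.  */
static const char factored_re[] = "bl f(\n[[:space:]]*nop|@notoc)";

/* Return 1 if TEXT matches the factored pattern, 0 otherwise.  */
int
factored_matches (const char *text)
{
  return re_matches (factored_re, text);
}

/* Return 1 if both patterns give the same answer for TEXT.  */
int
patterns_agree (const char *text)
{
  return re_matches (original_re, text) == re_matches (factored_re, text);
}
```

Both the TOC form ("bl f" followed by a nop on the next line) and the PC-relative form ("bl f@notoc") match, while a call followed by some other instruction does not.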


Segher


Re: [Patch] gfortran.dg/dg.exp: Add libgfortran as -I flag for ISO*.h [PR101305] (was: [PATCH 3/3] [PR libfortran/101305] Fix ISO_Fortran_binding.h paths in gfortran testsuite)

2021-07-28 Thread Jakub Jelinek via Gcc-patches
On Wed, Jul 28, 2021 at 01:22:53PM +0200, Tobias Burnus wrote:
> gfortran.dg/dg.exp: Add libgfortran as -I flag for ISO*.h [PR101305]
> 
> gcc/testsuite/
>   PR libfortran/101305
>   * gfortran.dg/dg.exp: Add '-I /libgfortran'
>   compile flag.

Wouldn't it be better to do that in gcc/testsuite/lib/gfortran.exp,
adding it to GFORTRAN_UNDER_TEST there next to
-B$specpath/libgfortran/ ?
That way we don't add it when testing an installed gfortran (there
we want to test what the installed gfortran will do), and it will
also affect libgomp testing.

Jakub



Re: [PATCH] correct uninitialized object offset and size computation [PR101494]

2021-07-28 Thread Martin Sebor via Gcc-patches

On 7/23/21 10:39 AM, Jeff Law wrote:



On 7/22/2021 3:58 PM, Martin Sebor via Gcc-patches wrote:

The code that computes the size of an access to an object in
-Wuninitialized is limited to declared objects and so doesn't
apply to allocated objects, and doesn't correctly account for
an offset into the object and the access size.  This causes
false positives.

The attached fix tested on x86_64-linux corrects this.

Martin

gcc-101494.diff

Correct uninitialized object offset and size computation [PR101494].

Resolves:
PR middle-end/101494 - -uninitialized false alarm with memrchr of size 0

gcc/ChangeLog:

PR middle-end/101494
* tree-ssa-uninit.c (builtin_call_nomodifying_p):
(check_defs):
(maybe_warn_operand):

gcc/testsuite/ChangeLog:

PR middle-end/101494
* gcc.dg/uninit-38.c:
* gcc.dg/uninit-41.c: New test.
* gcc.dg/uninit-pr101494.c: New test.
OK once you complete the ChangeLog entry for the tree-ssa-uninit.c 
change.  Note this change only modifies maybe_warn_operand.


Whoops.  Fixed and pushed in r12-2583.

Martin


Re: [PATCH v2 6/6] rs6000: Add tests for SSE4.1 "floor" intrinsics

2021-07-28 Thread Segher Boessenkool
On Fri, Jul 16, 2021 at 08:50:22AM -0500, Paul A. Clarke wrote:
> gcc/testsuite
>   * gcc.target/powerpc/sse4_1-floorpd.c: New.
>   * gcc.target/powerpc/sse4_1-floorps.c: New.
>   * gcc.target/powerpc/sse4_1-floorsd.c: New.
>   * gcc.target/powerpc/sse4_1-floorss.c: New.
>   * gcc.target/powerpc/sse4_1-roundpd-2.c: Copy from
>   gcc/testsuite/gcc.target/i386.

Okido.  Thanks!


Segher


Re: [PATCH v2 5/6] rs6000: Add support for SSE4.1 "floor" intrinsics

2021-07-28 Thread Segher Boessenkool
On Fri, Jul 16, 2021 at 08:50:21AM -0500, Paul A. Clarke wrote:
>   * config/rs6000/smmintrin.h (_mm_floor_pd, _mm_floor_ps,
>   _mm_floor_sd, _mm_floor_ss): New.

Okay for trunk.  Thanks!


Segher


Re: [PATCH v2 4/6] rs6000: Add tests for SSE4.1 "ceil" intrinsics

2021-07-28 Thread Segher Boessenkool
Hi!

On Fri, Jul 16, 2021 at 08:50:20AM -0500, Paul A. Clarke wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-round.h
> @@ -0,0 +1,27 @@
> +#include 
> +#include 
> +#include "sse4_1-check.h"
> +
> +#define DIM(a) (sizeof (a) / sizeof ((a)[0]))

Pet peeve: sizeof is an operator, not a function, so even if you want to
protect the macro parameter this just is
  #define DIM(a) (sizeof (a) / sizeof (a)[0])
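
A small sketch of why the inner parentheses are not needed: sizeof applied to an expression takes a unary expression, and (a)[0] is already one. The array and function names below are placeholders for illustration, not taken from the patch.

```c
#include <stddef.h>

/* Spelling from the patch, with parentheses around the whole operand.  */
#define DIM_PAREN(a) (sizeof (a) / sizeof ((a)[0]))
/* Suggested spelling: sizeof is an operator, so (a)[0] is enough.  */
#define DIM(a)       (sizeof (a) / sizeof (a)[0])

static const double samples[] = { 1.5, -2.5, 0.0, 3.25 };

/* Element count computed with the original spelling.  */
size_t
dim_paren_count (void)
{
  return DIM_PAREN (samples);
}

/* Element count computed with the suggested spelling.  */
size_t
dim_count (void)
{
  return DIM (samples);
}
```

Both macros expand to the same element count for a true array; neither works on a pointer.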

> +  (void) fesetround (round_save);

Please don't cast to (void).  That never does *anything*.

Okay for trunk (these are all testsuite files after all, and we should
test horrible style as well! :-P )

Thanks,


Segher


Re: [PATCH] add access warning pass

2021-07-28 Thread Martin Sebor via Gcc-patches

On 7/28/21 3:23 AM, Richard Biener wrote:

On Fri, Jul 16, 2021 at 12:42 AM Martin Sebor via Gcc-patches
 wrote:


A number of access warnings as well as their supporting
infrastructure (compute_objsize et al.) are implemented in
builtins.{c,h} where they (mostly) operate on trees and run
just before RTL expansion.

This setup may have made sense initially when the warnings were
very simple and didn't perform any CFG analysis, but it's becoming
a liability.  The code has grown both in size and in complexity,
might need to examine the CFG to improve detection, and in some
cases might achieve a better S/R ratio if run earlier.  Running
the warning code on trees is also slower because it doesn't
benefit from the SSA_NAME caching provided by the pointer_query
class.  Finally, having the code there is also an impediment to
maintainability as warnings and builtin expansion are unrelated
to each other and contributors to one area shouldn't need to wade
through unrelated code (similar for patch reviewers).

The attached change introduces a new warning pass and a couple of
new source and headers and, as the first step, moves the warning
code from builtins.{c,h} there.  To keep the initial changes as
simple as possible the pass only runs a subset of existing
warnings: -Wfree-nonheap-object, -Wmismatched-dealloc, and
-Wmismatched-new-delete.  The others (-Wstringop-overflow and
-Wstringop-overread) still run on the tree representation and
are still invoked from builtins.c or elsewhere.

The changes have no functional impact either on codegen or on
warnings.  I tested them on x86_64-linux.

As the next step I plan to change the -Wstringop-overflow and
-Wstringop-overread code to run on the GIMPLE IL in the new pass
instead of on trees in builtins.c.


That's the maybe_warn_rdwr_sizes thing?


Among others, yes.  It includes most buffer overflow and overread
warnings.



+  gimple *stmt = gsi_stmt (si);
+  if (!is_gimple_call (stmt))
+   continue;
+
+  check (as_a <gcall *> (stmt));


  if (gcall *call = dyn_cast <gcall *> (gsi_stmt (si)))
check (call);

might be more C++-ish.


Sure, I can do that.



The patch looks OK - I skimmed it as mostly moving things
around plus adding a new pass.


Okay, thanks.  I have retested the updated change and pushed it in
r12-2581.  As a reminder, the git show output for builtins.c looks
considerably different from the patch I posted because of what I
mentioned below.

Martin



Thanks,
Richard.


Martin

PS The builtins.c diff produced by git diff was much bigger than
the changes justify.  It seems that the code removal somehow
confused it.  To make review easy I replaced it with a plain
unified diff of builtins.c that doesn't suffer from the problem.




Re: [PATCH v2 3/6] rs6000: Add support for SSE4.1 "ceil" intrinsics

2021-07-28 Thread Segher Boessenkool
Hi!

On Fri, Jul 16, 2021 at 08:50:19AM -0500, Paul A. Clarke wrote:
>   * config/rs6000/smmintrin.h (_mm_ceil_pd, _mm_ceil_ps,
>   _mm_ceil_sd, _mm_ceil_ss): New.

This is fine.  Thanks!


Segher


Re: [PATCH] c/101512 - fix missing address-taking in c_common_mark_addressable_vec

2021-07-28 Thread Joseph Myers
On Wed, 21 Jul 2021, Jakub Jelinek via Gcc-patches wrote:

> I wonder if instead when trying to wrap
> C_MAYBE_CONST_EXPR into a VIEW_CONVERT_EXPR we shouldn't be
> removing that C_MAYBE_CONST_EXPR and perhaps adding it around the
> VIEW_CONVERT_EXPR.  E.g. various routines in c/c-typeck.c like
> build_unary_op remember int_operands, remove_c_maybe_const_expr
> and at the end note_integer_operands.
> 
> If Joseph thinks it is ok to have C_MAYBE_CONST_EXPR inside of
> VCE, then the patch looks good to me.

There are specific cases when a C_MAYBE_CONST_EXPR mustn't appear inside 
another expression: any case where the inner expression is required to be 
fully folded (this implies nested C_MAYBE_CONST_EXPR aren't allowed) and 
any case where the expression might appear (possibly unevaluated) in an 
integer constant expression (any C_MAYBE_CONST_EXPR noting that needs to 
be at top level).

If the expressions involved here can never appear in an integer constant 
expression and do not need to be fully folded, I think it's OK to have 
C_MAYBE_CONST_EXPR inside VIEW_CONVERT_EXPR.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 02/10] [i386] Enable _Float16 type for TARGET_SSE2 and above.

2021-07-28 Thread Joseph Myers
On Wed, 21 Jul 2021, liuhongt via Gcc-patches wrote:

> @@ -23254,13 +23337,15 @@ ix86_get_excess_precision (enum 
> excess_precision_type type)
>  provide would be identical were it not for the unpredictable
>  cases.  */
>   if (!TARGET_80387)
> -   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> +   return TARGET_SSE2
> +  ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> +  : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
>   else if (!TARGET_MIX_SSE_I387)
> {
>   if (!(TARGET_SSE && TARGET_SSE_MATH))
> return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
>   else if (TARGET_SSE2)
> -   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> +   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
> }
>  
>   /* If we are in standards compliant mode, but we know we will

This patch is not changing the default "fast" mode at all; that's 
promoting to float, unconditionally.  But you have a subsequent change 
there in patch 4 to make the promotions in the default "fast" mode depend 
on hardware support for the new instructions; it's unhelpful for the 
documentation not to correspond exactly to the code changes in the same 
patch.

Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2 
(i.e. whenever the type is available), it might make more sense to follow 
AArch64 and use it only when the hardware instructions are available.  In 
any case, it seems peculiar to use a different threshold in the "fast" 
case from the "standard" case.  -fexcess-precision=standard is not "avoid 
excess precision", it's "implement excess precision in the front end".  
Whenever "fast" is implementing excess precision in the front end, 
"standard" should be doing the same thing as "fast".

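As a rough illustration of what "excess precision" means for results (a 
sketch only: it uses float and double in place of _Float16 and float, the 
function names are made up for this example, and the default 
round-to-nearest mode is assumed), rounding after every operation and 
rounding once at the end can give different answers:

```c
/* Add 1.0f twice to X, rounding to float after each addition.  The
   volatile stores force every intermediate result back to float.  */
static float
add_twice_rounding_each_step (float x)
{
  volatile float t = x;
  t = t + 1.0f;
  t = t + 1.0f;
  return t;
}

/* The same sum, but carried in double (the "excess precision") and
   rounded to float only once, at the end.  */
static float
add_twice_rounding_once (float x)
{
  double wide = (double) x + 1.0 + 1.0;
  return (float) wide;
}
```

Starting from 16777216.0f (2^24), the first function stays at 16777216.0f 
because 2^24 + 1 is not representable in float, while the second returns 
16777218.0f.
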
> +Soft-fp keeps the intermediate result of the operation at 32-bit precision 
> by defaults,
> +which may lead to inconsistent behavior between soft-fp and avx512fp16 
> instructions,
> +using @option{-fexcess-precision=standard} will force round back after every 
> operation.

"soft-fp" is, as the name of some code within GCC, an internal 
implementation detail, which should not be referenced in the user manual.  
What results in intermediate results being in a wider precision is not 
soft-fp; it's promotions inserted by the front end as a result of how the 
above hook is defined (promotions inserted by the optabs/expand code are 
an implementation detail that should always be followed automatically by a 
truncation of the result and so not be user-visible).

As far as I know, the official name of "avx512fp16" is "AVX512-FP16" and 
text in the manual should use the official capitalization, hyphenation 
etc. in such names unless literally referring to command-line options 
inside @option or similar.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-28 Thread Qing Zhao via Gcc-patches
Hi, Kees,

Thanks a lot for your testing and the small testing case.

I just studied the root cause of this bug, and found that it’s because the call 
to “__builtin_clear_padding()” should NOT be inserted BEFORE
the variable initialization. It should be inserted AFTER the variable 
initialization. 

Currently the call to “__builtin_clear_padding()” is inserted before the 
variable initialization, like the following:

  __builtin_clear_padding (&obj, 0B, 1);
  obj = {};
  obj.val = val;

As a result, the reference to “obj” in the call to 
“__builtin_clear_padding” is treated as an uninitialized use.  
I will move the call to __builtin_clear_padding after the variable 
initialization. 
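
The intended ordering can be sketched at the source level as follows.  This 
is only an illustration, not the gimplifier change itself: it uses the 
user-visible one-argument form of __builtin_clear_padding (the 
three-argument call above is the compiler-internal form), and it guards the 
call with __has_builtin on the assumption that not every compiler provides 
the builtin.

```c
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

struct weird
{
  int bit : 1;
  int val;
};

int
func (int val)
{
  /* Initialize the object first ...  */
  struct weird obj = { .val = val };
#if __has_builtin (__builtin_clear_padding)
  /* ... and only then clear its padding bits, so the call never refers
     to an object that has not been initialized yet.  */
  __builtin_clear_padding (&obj);
#endif
  return obj.val;
}
```

Clearing the padding leaves the named members untouched, so func still 
returns the value it was given.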

Thanks.

Qing

> On Jul 28, 2021, at 3:21 PM, Kees Cook  wrote:
> 
> On Tue, Jul 27, 2021 at 03:26:00AM +, Qing Zhao wrote:
>> This is the 6th version of the patch for the new security feature for GCC.
>> 
>> I have tested it with bootstrap on both x86 and aarch64, regression testing 
>> on both x86 and aarch64.
>> Also compile CPU2017 (running is ongoing), without any issue. (With the fix 
>> to bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586).
>> 
>> Please take a look and let me know any issue.
> 
> Good news, this passes all my initialization tests in the kernel. Yay! :)
> 
> However, I see an unexpected side-effect from some static initializations:
> 
> net/core/sock.c: In function 'sock_no_sendpage':
> net/core/sock.c:2849:23: warning: 'msg' is used uninitialized 
> [-Wuninitialized]
> 2849 | struct msghdr msg = {.msg_flags = flags};
>  |   ^~~   
> 
> It seems like -Wuninitialized has suddenly stopped noticing explicit
> static initializers when there are bit fields in the struct. Here's a
> minimized case:
> 
> $ cat init.c
> struct weird {
>int bit : 1;
>int val;
> };
> 
> int func(int val)
> {
>struct weird obj = { .val = val };
>return obj.val;
> }
> 
> $ gcc -c -o init.o -Wall -O2 -ftrivial-auto-var-init=zero init.c
> init.c: In function ‘func’:
> init.c:8:22: warning: ‘obj’ is used uninitialized [-Wuninitialized]
>8 | struct weird obj = { .val = val };
>  |  ^~~
> init.c:8:22: note: ‘obj’ declared here
>8 | struct weird obj = { .val = val };
>  |  ^~~
> 
> 
> 
> -- 
> Kees Cook



Re: [PATCH v2 2/6] rs6000: Add tests for SSE4.1 "blend" intrinsics

2021-07-28 Thread Segher Boessenkool
Hi!

On Fri, Jul 16, 2021 at 08:50:18AM -0500, Paul A. Clarke wrote:
> Copy the tests for _mm_blend_pd, _mm_blendv_pd, _mm_blend_ps,
> _mm_blendv_ps from gcc/testsuite/gcc.target/i386.

You get less messy series in cases like this if you just put the tests
in the same patch as the code it tests (which works fine with Git by
default, it sorts everything in gcc/testsuite/ after everything in
gcc/config/ after all, so the important stuff is first in your patch).

> gcc/testsuite
>   * gcc.target/powerpc/sse4_1-blendpd.c: Copy from gcc.target/i386.
>   * gcc.target/powerpc/sse4_1-blendps-2.c: Likewise.
>   * gcc.target/powerpc/sse4_1-blendps.c: Likewise.
>   * gcc.target/powerpc/sse4_1-blendvpd.c: Likewise.

Well, they aren't exact copies, the dg-* statements are different (to
make it run only on a p8 or up, and enabling generating p8 code).  So
maybe say that?

Okay for trunk.  Thanks!


Segher


Re: [PATCH v2 1/6] rs6000: Add support for SSE4.1 "blend" intrinsics

2021-07-28 Thread Segher Boessenkool
Hi!

On Fri, Jul 16, 2021 at 08:50:17AM -0500, Paul A. Clarke wrote:
> _mm_blend_epi16 and _mm_blendv_epi8 were added earlier.
> Add these four to complete the set.
> 
> 2021-07-16  Paul A. Clarke  
> 
> gcc
>   * config/rs6000/smmintrin.h (_mm_blend_pd, _mm_blendv_pd,
>   _mm_blend_ps, _mm_blendv_ps): New.

I'm not sure if this is allowed like this in changelogs?  In either case
it is more obvious / aesthetically pleasing / etc. to write "gcc/".  But
also, it is fine to leave out this one, it being the default :-)

The patch is fine for trunk.  Thank you!


Segher


Re: [PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-07-28 Thread will schmidt via Gcc-patches
On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:


Hi,


> This is another patch that looks bigger than it really is.  Because we
> have a new namespace for the builtins, allowing us to have both the old
> and new builtin infrastructure supported at once, we need versions of
> these functions that use the new builtin namespace.  Otherwise the code is
> unchanged.

> 
> 2021-06-17  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
>   New forward decl.
>   (rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
>   (rs6000_new_builtin_valid_without_lhs): New function.
>   (rs6000_gimple_fold_new_mma_builtin): Likewise.
>   (rs6000_gimple_fold_new_builtin): Likewise.

ok

> ---
>  gcc/config/rs6000/rs6000-call.c | 1152 +++
>  1 file changed, 1152 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 269fddcdc7e..52df3d165e1 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, 
> machine_mode,
>  static void rs6000_common_init_builtins (void);
>  static void htm_init_builtins (void);
>  static void mma_init_builtins (void);
> +static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
> 
> 
>  /* Hash table to keep track of the argument types for builtin functions.  */
> @@ -11992,6 +11993,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
> *gsi)
>  bool
>  rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>  {
> +  if (new_builtins_are_live)
> +return rs6000_gimple_fold_new_builtin (gsi);
> +
>gimple *stmt = gsi_stmt (*gsi);
>tree fndecl = gimple_call_fndecl (stmt);
>gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == 
> BUILT_IN_MD);
> @@ -12939,6 +12943,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator 
> *gsi)
>return false;
>  }
> 
> +/*  Helper function to sort out which built-ins may be valid without having
> +a LHS.  */
> +static bool
> +rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
> +   tree fndecl)
> +{
> +  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
> +return true;
> +
> +  switch (fn_code)
> +{
> +case RS6000_BIF_STVX_V16QI:
> +case RS6000_BIF_STVX_V8HI:
> +case RS6000_BIF_STVX_V4SI:
> +case RS6000_BIF_STVX_V4SF:
> +case RS6000_BIF_STVX_V2DI:
> +case RS6000_BIF_STVX_V2DF:
> +case RS6000_BIF_STXVW4X_V16QI:
> +case RS6000_BIF_STXVW4X_V8HI:
> +case RS6000_BIF_STXVW4X_V4SF:
> +case RS6000_BIF_STXVW4X_V4SI:
> +case RS6000_BIF_STXVD2X_V2DF:
> +case RS6000_BIF_STXVD2X_V2DI:
> +  return true;
> +default:
> +  return false;
> +}
> +}

ok

> +
>  /* Check whether a builtin function is supported in this target
> configuration.  */
>  bool
> @@ -13030,6 +13063,1125 @@ rs6000_new_builtin_is_supported_p (enum 
> rs6000_gen_builtins fncode)
>return true;
>  }
> 
> +/* Expand the MMA built-ins early, so that we can convert the 
> pass-by-reference
> +   __vector_quad arguments into pass-by-value arguments, leading to more
> +   efficient code generation.  */
> +static bool
> +rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
> + rs6000_gen_builtins fn_code)
> +{
> +  gimple *stmt = gsi_stmt (*gsi);
> +  size_t fncode = (size_t) fn_code;
> +
> +  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
> +return false;
> +
> +  /* Each call that can be gimple-expanded has an associated built-in
> + function that it will expand into.  If this one doesn't, we have
> + already expanded it!  */
> +  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
> +return false;
> +
> +  bifdata *bd = &rs6000_builtin_info_x[fncode];
> +  unsigned nopnds = bd->nargs;
> +  gimple_seq new_seq = NULL;
> +  gimple *new_call;
> +  tree new_decl;
> +
> +  /* Compatibility built-ins; we used to call these
> + __builtin_mma_{dis,}assemble_pair, but now we call them
> + __builtin_vsx_{dis,}assemble_pair.  Handle the old verions.  */

versions.
(this snippet appears new to this version, so don't need to search for
an existing typo in current code. :-)

> +  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
> +fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
> +  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
> +fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
> +
> +  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
> +  || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
> +{
> +  /* This is an MMA disassemble built-in function.  */
> +  push_gimplify_context (true);
> +  unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
> +  tree dst_ptr = gimple_call_arg (stmt, 0);
> +  tree src_ptr = gimple_call_arg (stmt, 1);
> +  tree src_type = TREE_TYPE (src_ptr);
> +  tree src = 

Re: [PATCH][gcc] Allow functions without C-style ellipsis to use format attribute

2021-07-28 Thread Joseph Myers
On Mon, 19 Jul 2021, Martin Sebor via Gcc-patches wrote:

> You've answered my questions about the design (thank you) and I don't
> have any objections to the idea, but I'm not in a position to approve
> the patch.  I would suggest to get Jason's input on extending
> attribute format to variadic function templates, and Joseph's on
> extending it to ordinary (non-variadic) functions.  I've CC'd both.

One design question would be, if you allow this feature, whether it can be 
used in the case where there are no arguments to be formatted, with the 
corresponding attribute argument pointing just after the end of the 
function arguments.  My expectation would be that that case is valid (it 
could be used for a printf-like function that only substitutes no-argument 
formats such as %%), but if the argument number is any greater than that, 
there should be an error.  (And of course there would need to be test 
cases for that in the testsuite, both the valid case and the error for 
larger numbers, and the documentation of format attributes in extend.texi 
would need to describe such cases.)
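
For comparison, here is a sketch of the corresponding case with the 
existing attribute on a variadic function (not the proposed extension; 
log_msg and log_buf are hypothetical names used only for illustration).  
The first-to-check index points just after the last named parameter, and a 
call that passes no arguments to be formatted is accepted:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging helper; the name and buffer are illustrative.  */
static char log_buf[64];

/* The format string is argument 1; the arguments to check start at 2,
   which is the position of the "..." just after the last named
   parameter.  */
static int log_msg (const char *fmt, ...)
  __attribute__ ((format (printf, 1, 2)));

static int
log_msg (const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  int n = vsnprintf (log_buf, sizeof log_buf, fmt, ap);
  va_end (ap);
  assert (n >= 0);
  return n;
}
```

A call such as log_msg ("100%%") uses only a no-argument format and leaves 
"100%" in log_buf.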

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 53/55] rs6000: Update altivec.h for automated interfaces

2021-07-28 Thread Bill Schmidt via Gcc-patches

Hi Will,

On 7/27/21 4:07 PM, will schmidt wrote:

On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:

2021-06-10  Bill Schmidt  

gcc/
* config/rs6000/altivec.h: Delete a number of #defines that are
now superfluous; include rs6000-vecdefines.h; include some
synonyms.
---
  gcc/config/rs6000/altivec.h | 522 +++---
--
  1 file changed, 41 insertions(+), 481 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h
b/gcc/config/rs6000/altivec.h
index 5b631c7ebaf..8daf933e53e 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -55,32 +55,36 @@
  #define __CR6_LT  2
  #define __CR6_LT_REV  3

-/* Synonyms.  */
+#include "rs6000-vecdefines.h"
+
+/* Deprecated interfaces.  */
+#define vec_lvx vec_ld
+#define vec_lvxl vec_ldl
+#define vec_stvx vec_st
+#define vec_stvxl vec_stl
  #define vec_vaddcuw vec_addc
  #define vec_vand vec_and
  #define vec_vandc vec_andc
-#define vec_vrfip vec_ceil
  #define vec_vcmpbfp vec_cmpb
  #define vec_vcmpgefp vec_cmpge
  #define vec_vctsxs vec_cts
  #define vec_vctuxs vec_ctu
  #define vec_vexptefp vec_expte
-#define vec_vrfim vec_floor
-#define vec_lvx vec_ld
-#define vec_lvxl vec_ldl
  #define vec_vlogefp vec_loge
  #define vec_vmaddfp vec_madd
  #define vec_vmhaddshs vec_madds
-#define vec_vmladduhm vec_mladd
  #define vec_vmhraddshs vec_mradds
+#define vec_vmladduhm vec_mladd
  #define vec_vnmsubfp vec_nmsub
  #define vec_vnor vec_nor
  #define vec_vor vec_or
-#define vec_vpkpx vec_packpx
  #define vec_vperm vec_perm
-#define vec_permxor __builtin_vec_vpermxor
+#define vec_vpkpx vec_packpx
  #define vec_vrefp vec_re
+#define vec_vrfim vec_floor
  #define vec_vrfin vec_round
+#define vec_vrfip vec_ceil
+#define vec_vrfiz vec_trunc
  #define vec_vrsqrtefp vec_rsqrte
  #define vec_vsel vec_sel
  #define vec_vsldoi vec_sld
@@ -91,440 +95,56 @@
  #define vec_vspltisw vec_splat_s32
  #define vec_vsr vec_srl
  #define vec_vsro vec_sro
-#define vec_stvx vec_st
-#define vec_stvxl vec_stl
  #define vec_vsubcuw vec_subc
  #define vec_vsum2sws vec_sum2s
  #define vec_vsumsws vec_sums
-#define vec_vrfiz vec_trunc
  #define vec_vxor vec_xor

Appears to be rearranged/alphabetized.. OK.


+#ifdef _ARCH_PWR8
+#define vec_vclz vec_cntlz
+#define vec_vgbbd vec_gb
+#define vec_vmrgew vec_mergee
+#define vec_vmrgow vec_mergeo
+#define vec_vpopcntu vec_popcnt
+#define vec_vrld vec_rl
+#define vec_vsld vec_sl
+#define vec_vsrd vec_sr
+#define vec_vsrad vec_sra
+#endif


Does anything bad happen if these are simply defined, without the
#ifdef/#endif protection?
I'm wondering if there is some scenario with
pragma GCC target "cpu=powerX" where we may want them defined
anyway.



Yes, you're right about that.  We could run into such problems, I 
think.  I think it's best to always define these.  If the builtin isn't 
supported for the specific target configuration, it'll be flagged during 
the lookup process.


Good catch!  Thanks for the review!
Bill




Everything else appears straightforward on this one, appears to be
mostly deletions.

lgtm,
thanks
-Will



+
+#ifdef _ARCH_PWR9
+#define vec_extract_fp_from_shorth vec_extract_fp32_from_shorth
+#define vec_extract_fp_from_shortl vec_extract_fp32_from_shortl
+#define vec_vctz vec_cnttz
+#endif
+
+/* Synonyms.  */
  /* Functions that are resolved by the backend to one of the
 typed builtins.  */
-#define vec_vaddfp __builtin_vec_vaddfp
-#define vec_addc __builtin_vec_addc
-#define vec_adde __builtin_vec_adde
-#define vec_addec __builtin_vec_addec
-#define vec_vaddsws __builtin_vec_vaddsws
-#define vec_vaddshs __builtin_vec_vaddshs
-#define vec_vaddsbs __builtin_vec_vaddsbs
-#define vec_vavgsw __builtin_vec_vavgsw
-#define vec_vavguw __builtin_vec_vavguw
-#define vec_vavgsh __builtin_vec_vavgsh
-#define vec_vavguh __builtin_vec_vavguh
-#define vec_vavgsb __builtin_vec_vavgsb
-#define vec_vavgub __builtin_vec_vavgub
-#define vec_ceil __builtin_vec_ceil
-#define vec_cmpb __builtin_vec_cmpb
-#define vec_vcmpeqfp __builtin_vec_vcmpeqfp
-#define vec_cmpge __builtin_vec_cmpge
-#define vec_vcmpgtfp __builtin_vec_vcmpgtfp
-#define vec_vcmpgtsw __builtin_vec_vcmpgtsw
-#define vec_vcmpgtuw __builtin_vec_vcmpgtuw
-#define vec_vcmpgtsh __builtin_vec_vcmpgtsh
-#define vec_vcmpgtuh __builtin_vec_vcmpgtuh
-#define vec_vcmpgtsb __builtin_vec_vcmpgtsb
-#define vec_vcmpgtub __builtin_vec_vcmpgtub
-#define vec_vcfsx __builtin_vec_vcfsx
-#define vec_vcfux __builtin_vec_vcfux
-#define vec_cts __builtin_vec_cts
-#define vec_ctu __builtin_vec_ctu
-#define vec_cpsgn __builtin_vec_copysign
-#define vec_double __builtin_vec_double
-#define vec_doublee __builtin_vec_doublee
-#define vec_doubleo __builtin_vec_doubleo
-#define vec_doublel __builtin_vec_doublel
-#define vec_doubleh __builtin_vec_doubleh
-#define vec_expte __builtin_vec_expte
-#define vec_float __builtin_vec_float
-#define vec_float2 __builtin_vec_float2
-#define vec_floate __builtin_vec_floate

Re: [PATCH] c++: Improve memory usage of subsumption [PR100828]

2021-07-28 Thread Jason Merrill via Gcc-patches

On 7/19/21 6:05 PM, Patrick Palka wrote:

Constraint subsumption is implemented in two steps.  The first step
computes the disjunctive (or conjunctive) normal form of one of the
constraints, and the second step verifies that each clause in the
decomposed form implies the other constraint.   Performing these two
steps separately is problematic because in the first step the
disjunctive normal form can be exponentially larger than the original
constraint, and by computing it ahead of time we'd have to keep all of
it in memory.

This patch fixes this exponential blowup in memory usage by interleaving
these two steps, so that as soon as we decompose one clause we check
implication for it.  In turn, memory usage during subsumption is now
worst case linear in the size of the constraints rather than
exponential, and so we can safely remove the hard limit of 16 clauses
without introducing runaway memory usage on some inputs.  (Note the
_time_ complexity of subsumption is still exponential in the worst case.)

In order for this to work we need formula::branch to prepend the copy
of the current clause directly after the current clause rather than
at the end of the list, so that we fully decompose a clause shortly
after creating it.  Otherwise we'd end up accumulating exponentially
many (partially decomposed) clauses in memory anyway.
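
The effect of where the copy goes can be seen in a toy model (a sketch 
only, not the logic.cc code: a "clause" here is just a count of 
disjunctions still to be split, and all names are made up for the 
example).  A clause is freed as soon as it is fully decomposed, which 
stands in for checking implication and then erasing it:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy clause: K counts the disjunctions still to be split.  */
struct clause
{
  int k;
  struct clause *next;
};

/* Decompose one clause with N pending splits and return the peak number
   of clauses alive at once.  If INSERT_AFTER_CURRENT is nonzero, a branch
   copy goes right after the current clause; otherwise it is appended at
   the tail of the list.  */
static size_t
peak_live_clauses (int n, int insert_after_current)
{
  struct clause *head = malloc (sizeof *head);
  assert (head != NULL);
  head->k = n;
  head->next = NULL;
  struct clause *tail = head;
  size_t live = 1, peak = 1;

  while (head != NULL)
    {
      if (head->k == 0)
        {
          /* Fully decomposed: implication would be checked here, and
             then the clause is freed.  */
          struct clause *done = head;
          head = head->next;
          if (head == NULL)
            tail = NULL;
          free (done);
          live--;
          continue;
        }

      /* Split one disjunction: the current clause keeps one operand and
         a copy takes the other.  */
      head->k--;
      struct clause *copy = malloc (sizeof *copy);
      assert (copy != NULL);
      copy->k = head->k;
      if (insert_after_current)
        {
          copy->next = head->next;
          head->next = copy;
          if (tail == head)
            tail = copy;
        }
      else
        {
          copy->next = NULL;
          tail->next = copy;
          tail = copy;
        }
      if (++live > peak)
        peak = live;
    }
  return peak;
}
```

In this model, inserting after the current clause keeps the peak at n + 1 
live clauses, whereas appending at the tail lets partially decomposed 
clauses pile up behind the current one.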

Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
range-v3 and cmcstl2.  Does this look OK for trunk and perhaps 11?


OK for trunk.


PR c++/100828

gcc/cp/ChangeLog:

* logic.cc (formula::formula): Use emplace_back.
(formula::branch): Insert a copy of m_current in front of
m_current instead of at the end of the list.
(formula::erase): Define.
(decompose_formula): Remove.
(decompose_antecedents): Remove.
(decompose_consequents): Remove.
(derive_proofs): Remove.
(max_problem_size): Remove.
(diagnose_constraint_size): Remove.
(subsumes_constraints_nonnull): Rewrite directly in terms of
decompose_clause and derive_proof, interleaving decomposition
with implication checking.  Use formula::erase to free the
current clause before moving on to the next one.
---
  gcc/cp/logic.cc | 118 ++--
  1 file changed, 35 insertions(+), 83 deletions(-)

diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
index 142457e408a..3f872c11fe2 100644
--- a/gcc/cp/logic.cc
+++ b/gcc/cp/logic.cc
@@ -223,9 +223,7 @@ struct formula
  
formula (tree t)

{
-/* This should call emplace_back(). There's an extra copy being
-   invoked by using push_back().  */
-m_clauses.push_back (t);
+m_clauses.emplace_back (t);
  m_current = m_clauses.begin ();
}
  
@@ -248,8 +246,7 @@ struct formula

clause& branch ()
{
  gcc_assert (!done ());
-m_clauses.push_back (*m_current);
-return m_clauses.back ();
+return *m_clauses.insert (std::next (m_current), *m_current);
}
  
/* Returns the position of the current clause.  */

@@ -287,6 +284,14 @@ struct formula
  return m_clauses.end ();
}
  
+  /* Remove the specified clause.  */

+
+  void erase (iterator i)
+  {
+gcc_assert (i != m_current);
+m_clauses.erase (i);
+  }
+
std::list<clause> m_clauses; /* The list of clauses.  */
iterator m_current; /* The current clause.  */
  };
@@ -659,39 +664,6 @@ decompose_clause (formula& f, clause& c, rules r)
f.advance ();
  }
  
-/* Decompose the logical formula F according to the logical

-   rules determined by R.  The result is a formula containing
-   clauses that contain only atomic terms.  */
-
-void
-decompose_formula (formula& f, rules r)
-{
-  while (!f.done ())
-decompose_clause (f, *f.current (), r);
-}
-
-/* Fully decomposing T into a list of sequents, each comprised of
-   a list of atomic constraints, as if T were an antecedent.  */
-
-static formula
-decompose_antecedents (tree t)
-{
-  formula f (t);
-  decompose_formula (f, left);
-  return f;
-}
-
-/* Fully decomposing T into a list of sequents, each comprised of
-   a list of atomic constraints, as if T were a consequent.  */
-
-static formula
-decompose_consequents (tree t)
-{
-  formula f (t);
-  decompose_formula (f, right);
-  return f;
-}
-
  static bool derive_proof (clause&, tree, rules);
  
  /* Derive a proof of both operands of T.  */

@@ -744,28 +716,6 @@ derive_proof (clause& c, tree t, rules r)
}
  }
  
-/* Derive a proof of T from disjunctive clauses in F.  */

-
-static bool
-derive_proofs (formula& f, tree t, rules r)
-{
-  for (formula::iterator i = f.begin(); i != f.end(); ++i)
-if (!derive_proof (*i, t, r))
-  return false;
-  return true;
-}
-
-/* The largest number of clauses in CNF or DNF we accept as input
-   for subsumption. This an upper bound of 2^16 expressions.  */
-static int max_problem_size = 16;
-
-static inline bool
-diagnose_constraint_size (tree t)
-{
-  error_at 

[r12-2549 Regression] FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxwq 2 on Linux/x86_64

2021-07-28 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

872da9a6f664a06d73c987aa0cb2e5b830158a10 is the first bad commit
commit 872da9a6f664a06d73c987aa0cb2e5b830158a10
Author: liuhongt 
Date:   Fri Mar 26 10:56:47 2021 +0800

Add the member integer_to_sse to processor_cost as a cost simulation for 
movd/pinsrd. It will be used to calculate the cost of vec_construct.

caused

FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\n\r]*xmm[0-9] 2
FAIL: gcc.target/i386/pr92658-avx512bw-2.c scan-assembler-times pmovsxdq 2
FAIL: gcc.target/i386/pr92658-avx512bw.c scan-assembler-times pmovzxdq 2
FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxdq 2
FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxwq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxdq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxwq 2

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-2549/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr91446.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr91446.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-avx512bw-2.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-avx512bw-2.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-avx512bw.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-avx512bw.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-sse4-2.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-sse4-2.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-sse4.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr92658-sse4.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[r12-2558 Regression] FAIL: gfortran.dg/guality/pr41558.f90 -Os line 7 s == 'foo' on Linux/x86_64

2021-07-28 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

0f95c6b2f7dae35ec8c9f211d63edc42baa1d2b7 is the first bad commit
commit 0f95c6b2f7dae35ec8c9f211d63edc42baa1d2b7
Author: Bin Cheng 
Date:   Wed Jul 28 17:44:35 2021 +0800

Don't skip prologue/epilogue when initializing alias.

caused

FAIL: gcc.dg/guality/drap.c   -Os  -DPREVENT_OPTIMIZATION  line 21 a == 5
FAIL: gcc.dg/guality/drap.c   -Os  -DPREVENT_OPTIMIZATION  line 22 b == 6
FAIL: gcc.dg/guality/pr43051-1.c   -Os  -DPREVENT_OPTIMIZATION  line 35 v == 1
FAIL: gcc.dg/guality/pr43051-1.c   -Os  -DPREVENT_OPTIMIZATION  line 36 e == 
[1]
FAIL: gcc.dg/guality/pr43051-1.c   -Os  -DPREVENT_OPTIMIZATION  line 40 v == 1
FAIL: gcc.dg/guality/pr43051-1.c   -Os  -DPREVENT_OPTIMIZATION  line 41 e == 
[1]
FAIL: gcc.dg/guality/pr43177.c   -Os  -DPREVENT_OPTIMIZATION  line 15 x == 7
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c   -Os  -DPREVENT_OPTIMIZATION  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c   -Os  -DPREVENT_OPTIMIZATION  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c   -Os  -DPREVENT_OPTIMIZATION  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c   -Os  -DPREVENT_OPTIMIZATION  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-4.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 17 y == 25
FAIL: gcc.dg/guality/pr54519-4.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 17 y == 25
FAIL: gcc.dg/guality/pr54519-4.c   -Os  -DPREVENT_OPTIMIZATION  line 17 y == 25
FAIL: gcc.dg/guality/pr54796.c   -O1  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O1  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O1  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -O2  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O2  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O2  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/pr54796.c   -Os  -DPREVENT_OPTIMIZATION  line 17 a == 5
FAIL: gcc.dg/guality/pr54796.c   -Os  -DPREVENT_OPTIMIZATION  line 17 b == 6
FAIL: gcc.dg/guality/pr54796.c   -Os  -DPREVENT_OPTIMIZATION  line 17 c == 5
FAIL: gcc.dg/guality/vla-1.c   -O1  -DPREVENT_OPTIMIZATION  line 24 i == 5
FAIL: gcc.dg/guality/vla-1.c   -O2  -DPREVENT_OPTIMIZATION  line 24 i == 5
FAIL: gcc.dg/guality/vla-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  -DPREVENT_OPTIMIZATION line 24 i == 5
FAIL: gcc.dg/guality/vla-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 24 i == 5
FAIL: gcc.dg/guality/vla-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 24 sizeof (a) == 17 * sizeof 
(short)
FAIL: gcc.dg/guality/vla-1.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 24 i == 5
FAIL: gcc.dg/guality/vla-1.c   -Os  -DPREVENT_OPTIMIZATION  line 24 i == 5
FAIL: gfortran.dg/guality/pr41558.f90   -Os  line 7 s == 'foo'

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-2558/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="guality.exp=gcc.dg/guality/drap.c 

Re: [PATCH 49/55] rs6000: Builtin expansion, part 6

2021-07-28 Thread Bill Schmidt via Gcc-patches



On 7/27/21 4:07 PM, will schmidt wrote:

On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:

2021-03-24  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (new_htm_spr_num): New function.
(new_htm_expand_builtin): Implement.
(rs6000_expand_new_builtin): Handle 32-bit and endian cases.
---
  gcc/config/rs6000/rs6000-call.c | 202 
  1 file changed, 202 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index e1b685fb874..6b60f0852ef 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -15068,11 +15068,171 @@ new_mma_expand_builtin (tree exp, rtx target, 
insn_code icode,
return target;
  }

+/* Return the appropriate SPR number associated with the given builtin.  */
+static inline HOST_WIDE_INT
+new_htm_spr_num (enum rs6000_gen_builtins code)
+{
+  if (code == RS6000_BIF_GET_TFHAR
+  || code == RS6000_BIF_SET_TFHAR)
+return TFHAR_SPR;
+  else if (code == RS6000_BIF_GET_TFIAR
+  || code == RS6000_BIF_SET_TFIAR)
+return TFIAR_SPR;
+  else if (code == RS6000_BIF_GET_TEXASR
+  || code == RS6000_BIF_SET_TEXASR)
+return TEXASR_SPR;
+  gcc_assert (code == RS6000_BIF_GET_TEXASRU
+ || code == RS6000_BIF_SET_TEXASRU);
+  return TEXASRU_SPR;
+}

Ok,



+
  /* Expand the HTM builtin in EXP and store the result in TARGET.  */
  static rtx
  new_htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode,
tree exp, rtx target)
  {
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  bool nonvoid = TREE_TYPE (TREE_TYPE (fndecl)) != void_type_node;
+
+  if (!TARGET_POWERPC64
+  && (fcode == RS6000_BIF_TABORTDC
+ || fcode == RS6000_BIF_TABORTDCI))
+{
+  error ("builtin %qs is only valid in 64-bit mode", bifaddr->bifname);
+  return const0_rtx;
+}

ok


+
+  rtx op[MAX_HTM_OPERANDS], pat;
+  int nopnds = 0;
+  tree arg;
+  call_expr_arg_iterator iter;
+  insn_code icode = bifaddr->icode;
+  bool uses_spr = bif_is_htmspr (*bifaddr);
+  rtx cr = NULL_RTX;
+
+  if (uses_spr)
+icode = rs6000_htm_spr_icode (nonvoid);
+  const insn_operand_data *insn_op = &insn_data[icode].operand[0];
+
+  if (nonvoid)
+{
+  machine_mode tmode = (uses_spr) ? insn_op->mode : E_SImode;
+  if (!target
+ || GET_MODE (target) != tmode
+ || (uses_spr && !(*insn_op->predicate) (target, tmode)))
+   target = gen_reg_rtx (tmode);
+  if (uses_spr)
+   op[nopnds++] = target;
+}
+
+  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+{
+  if (arg == error_mark_node || nopnds >= MAX_HTM_OPERANDS)
+   return const0_rtx;
+
+  insn_op = &insn_data[icode].operand[nopnds];
+  op[nopnds] = expand_normal (arg);
+
+  if (!(*insn_op->predicate) (op[nopnds], insn_op->mode))
+   {
+ if (!strcmp (insn_op->constraint, "n"))
+   {
+ int arg_num = (nonvoid) ? nopnds : nopnds + 1;
+ if (!CONST_INT_P (op[nopnds]))
+   error ("argument %d must be an unsigned literal", arg_num);
+ else
+   error ("argument %d is an unsigned literal that is "
+  "out of range", arg_num);
+ return const0_rtx;
+   }
+ op[nopnds] = copy_to_mode_reg (insn_op->mode, op[nopnds]);
+   }
+
+  nopnds++;
+}
+
+  /* Handle the builtins for extended mnemonics.  These accept
+ no arguments, but map to builtins that take arguments.  */
+  switch (fcode)
+{
+case RS6000_BIF_TENDALL:  /* Alias for: tend. 1  */
+case RS6000_BIF_TRESUME:  /* Alias for: tsr. 1  */
+  op[nopnds++] = GEN_INT (1);
+  break;
+case RS6000_BIF_TSUSPEND: /* Alias for: tsr. 0  */
+  op[nopnds++] = GEN_INT (0);
+  break;
+default:
+  break;
+}

ok


+
+  /* If this builtin accesses SPRs, then pass in the appropriate
+ SPR number and SPR regno as the last two operands.  */
+  if (uses_spr)
+{
+  machine_mode mode = (TARGET_POWERPC64) ? DImode : SImode;
+  op[nopnds++] = gen_rtx_CONST_INT (mode, new_htm_spr_num (fcode));
+}
+  /* If this builtin accesses a CR, then pass in a scratch
+ CR as the last operand.  */
+  else if (bif_is_htmcr (*bifaddr))

Given this is an if/else, presumably there are no builtins that use
both a SPR and access a CR ?



Yes, that's right.  This is the same logic as used for HTM in the old 
builtins code.





+{
+  cr = gen_reg_rtx (CCmode);
+  op[nopnds++] = cr;
+}
+
+  switch (nopnds)
+{
+case 1:
+  pat = GEN_FCN (icode) (op[0]);
+  break;
+case 2:
+  pat = GEN_FCN (icode) (op[0], op[1]);
+  break;
+case 3:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2]);
+  break;
+case 4:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
+  break;
+default:
+  gcc_unreachable ();
+}
+  if (!pat)
+return NULL_RTX;
+  

Re: [PATCH] c++: Accept C++11 attribute-definition [PR101582]

2021-07-28 Thread Jason Merrill via Gcc-patches

On 7/23/21 4:03 AM, Jakub Jelinek wrote:

Hi!

As the following testcase shows, we don't parse properly
C++11 attribute-declaration:
https://eel.is/c++draft/dcl.dcl#nt:attribute-declaration

cp_parser_toplevel_declaration just handles empty-declaration parsing
(with diagnostics for C++98)


This seems to be a bug: from the comments, 
cp_parser_toplevel_declaration is intended to only handle #pragma 
parsing, everything else should be in cp_parser_declaration.


As a result, we wrongly reject

extern "C" ;

So please move empty-declaration and attribute-declaration handling into 
cp_parser_declaration.



and otherwise calls cp_parser_declaration
which in turn calls cp_parser_simple_declaration and rejects it with
"does not declare anything" permerror.

The following patch instead handles it in cp_parser_toplevel_declaration
by parsing the attributes (standard ones only, we've never supported
__attribute__((...)); at namespace scope, so I'm not sure we need to
introduce that), which for C++98 emits the needed diagnostics, and then
warning if there are any attributes that we throw away on the floor.

I'll need this later for OpenMP directives at namespace scope, e.g.
[[omp::directive (requires, atomic_default_mem_order(seq_cst))]];
should be valid at namespace scope (and many other directives).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-07-23  Jakub Jelinek  

PR c++/101582
* parser.c (cp_parser_skip_std_attribute_spec_seq): Add a forward
declaration.
(cp_parser_toplevel_declaration): Parse attribute-declaration.

* g++.dg/cpp0x/gen-attrs-45.C: Expect a warning about ignored
attributes instead of error.
* g++.dg/cpp0x/gen-attrs-75.C: New test.

--- gcc/cp/parser.c.jj  2021-07-22 17:47:26.025761491 +0200
+++ gcc/cp/parser.c 2021-07-22 19:09:28.487513184 +0200
@@ -2507,6 +2507,8 @@ static tree cp_parser_std_attribute_spec
(cp_parser *);
  static tree cp_parser_std_attribute_spec_seq
(cp_parser *);
+static size_t cp_parser_skip_std_attribute_spec_seq
+  (cp_parser *, size_t);
  static size_t cp_parser_skip_attributes_opt
(cp_parser *, size_t);
  static bool cp_parser_extension_opt
@@ -14547,6 +14549,20 @@ cp_parser_toplevel_declaration (cp_parse
if (cxx_dialect < cxx11)
pedwarn (input_location, OPT_Wpedantic, "extra %<;%>");
  }
+  else if (cp_lexer_nth_token_is (parser->lexer,
+ cp_parser_skip_std_attribute_spec_seq (parser,
+1),
+ CPP_SEMICOLON))
+{
+  location_t attrs_loc = token->location;
+  tree std_attrs = cp_parser_std_attribute_spec_seq (parser);
+  if (std_attrs != NULL_TREE)
+   warning_at (make_location (attrs_loc, attrs_loc, parser->lexer),
+   OPT_Wattributes,
+   "attributes in attribute declaration are ignored");
+  if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
+   cp_lexer_consume_token (parser->lexer);
+}
else
  /* Parse the declaration itself.  */
  cp_parser_declaration (parser, NULL_TREE);
--- gcc/testsuite/g++.dg/cpp0x/gen-attrs-45.C.jj  2020-01-12 11:54:37.072403466 +0100
+++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-45.C   2021-07-22 19:14:38.250222344 +0200
@@ -1,4 +1,4 @@
 // PR c++/52906
 // { dg-do compile { target c++11 } }
 
-[[gnu::deprecated]]; // { dg-error "does not declare anything" }
+[[gnu::deprecated]]; // { dg-warning "attributes in attribute declaration are ignored" }
--- gcc/testsuite/g++.dg/cpp0x/gen-attrs-75.C.jj  2021-07-22 19:14:58.438942693 +0200
+++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-75.C   2021-07-22 19:12:18.442158972 +0200
@@ -0,0 +1,8 @@
+// PR c++/101582
+// { dg-do compile }
+// { dg-options "" }
+
+;
+[[]] [[]] [[]]; // { dg-warning "attributes only available with" "" { target c++98_only } }
+[[foobar]]; // { dg-warning "attributes in attribute declaration are ignored" }
+// { dg-warning "attributes only available with" "" { target c++98_only } .-1 }

Jakub





Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-28 Thread Kees Cook via Gcc-patches
On Tue, Jul 27, 2021 at 03:26:00AM +, Qing Zhao wrote:
> This is the 6th version of the patch for the new security feature for GCC.
> 
> I have tested it with bootstrap on both x86 and aarch64, regression testing 
> on both x86 and aarch64.
> Also compile CPU2017 (running is ongoing), without any issue. (With the fix 
> to bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586).
> 
> Please take a look and let me know any issue.

Good news, this passes all my initialization tests in the kernel. Yay! :)

However, I see an unexpected side-effect from some static initializations:

net/core/sock.c: In function 'sock_no_sendpage':
net/core/sock.c:2849:23: warning: 'msg' is used uninitialized [-Wuninitialized]
 2849 | struct msghdr msg = {.msg_flags = flags};
  |   ^~~   

It seems like -Wuninitialized has suddenly stopped noticing explicit
static initializers when there are bit fields in the struct. Here's a
minimized case:

$ cat init.c
struct weird {
int bit : 1;
int val;
};

int func(int val)
{
struct weird obj = { .val = val };
return obj.val;
}

$ gcc -c -o init.o -Wall -O2 -ftrivial-auto-var-init=zero init.c
init.c: In function ‘func’:
init.c:8:22: warning: ‘obj’ is used uninitialized [-Wuninitialized]
8 | struct weird obj = { .val = val };
  |  ^~~
init.c:8:22: note: ‘obj’ declared here
8 | struct weird obj = { .val = val };
  |  ^~~
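
For reference, a minimal sketch of why the warning above looks like a
false positive. In ISO C, members that a designated initializer does not
name are initialized as if the object had static storage duration, so the
bit-field is zero and 'obj' is fully initialized before it is read. The
helper name below is made up for illustration and is not part of the
reported test case.

```c
struct weird {
    int bit : 1;
    int val;
};

/* Hypothetical helper: returns 1 when the member that the designated
   initializer does not name (the bit-field) reads as zero and the
   named member holds the value that was passed in.  */
int partial_init_is_complete(int val)
{
    struct weird obj = { .val = val };
    return obj.bit == 0 && obj.val == val;
}
```

Calling partial_init_is_complete(7) is expected to return 1 whether or
not -ftrivial-auto-var-init=zero is in effect, since the language already
requires the zeroing.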



-- 
Kees Cook


New Swedish PO file for 'gcc' (version 11.2.0)

2021-07-28 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-11.2.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[committed] analyzer: play better with -fsanitize=bounds

2021-07-28 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as 37eb3ef48c9840475646528751b5f8ffb7eb34ce.

gcc/analyzer/ChangeLog:
* region-model.cc (region_model::on_call_pre): Treat
IFN_UBSAN_BOUNDS, BUILT_IN_STACK_SAVE, and BUILT_IN_STACK_RESTORE
as no-ops, rather than handling them as unknown functions.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/ubsan-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model.cc  |  6 ++
 .../gcc.dg/analyzer/torture/ubsan-1.c | 60 +++
 2 files changed, 66 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/ubsan-1.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 92fa917d14d..1bc411b2ed6 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1082,6 +1082,8 @@ region_model::on_call_pre (const gcall *call, 
region_model_context *ctxt,
case IFN_BUILTIN_EXPECT:
 impl_call_builtin_expect (cd);
 return false;
+   case IFN_UBSAN_BOUNDS:
+return false;
}
 }
 
@@ -1137,6 +1139,10 @@ region_model::on_call_pre (const gcall *call, 
region_model_context *ctxt,
impl_call_strlen (cd);
return false;
 
+ case BUILT_IN_STACK_SAVE:
+ case BUILT_IN_STACK_RESTORE:
+   return false;
+
  /* Stdio builtins.  */
  case BUILT_IN_FPRINTF:
  case BUILT_IN_FPRINTF_UNLOCKED:
diff --git a/gcc/testsuite/gcc.dg/analyzer/torture/ubsan-1.c 
b/gcc/testsuite/gcc.dg/analyzer/torture/ubsan-1.c
new file mode 100644
index 000..b9f34f166ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/torture/ubsan-1.c
@@ -0,0 +1,60 @@
+/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } { "" } } */
+/* { dg-additional-options "-fsanitize=bounds" } */
+
+#include <stdlib.h>
+#include "../analyzer-decls.h"
+
+int test_1 (int *arr, int i, int n)
+{
+  if (i >= n)
+return 0;
+  return arr[i];
+}
+
+int test_2 (int *arr, int i, int n)
+{
+  if (i >= n)
+return 0;
+  if (arr[i])
+__analyzer_eval (arr[i]); /* { dg-warning "TRUE" } */
+  else
+__analyzer_eval (arr[i]); /* { dg-warning "FALSE" } */
+}
+
+int test_3 (int arr[], int i, int n)
+{
+  if (i >= n)
+return 0;
+  if (arr[i])
+__analyzer_eval (arr[i]); /* { dg-warning "TRUE" } */
+  else
+__analyzer_eval (arr[i]); /* { dg-warning "FALSE" } */
+}
+
+void test_4 (int i, int n)
+{
+  int arr[n];
+  arr[i] = 42;
+  __analyzer_eval (arr[i] == 42); /* { dg-warning "TRUE" } */
+}
+
+void test_5 (int i, int n)
+{
+  int *arr = malloc (sizeof(int) * n);
+  if (arr)
+{
+  arr[i] = 42;
+  __analyzer_eval (arr[i] == 42); /* { dg-warning "TRUE" } */
+}
+  free (arr);
+}
+
+int global;
+
+void test_6 (int i, int n)
+{
+  int arr[n];
+  int saved = global;
+  arr[i] = 42;
+  __analyzer_eval (saved == global); /* { dg-warning "TRUE" } */
+}
-- 
2.26.3



[committed] analyzer: remove redundant return value from various impl_call_*

2021-07-28 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as b5081130166a4f2e363f116e0e6b43d83422c947.

gcc/analyzer/ChangeLog:
* region-model-impl-calls.cc (region_model::impl_call_alloca):
Drop redundant return value.
(region_model::impl_call_builtin_expect): Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_operator_delete): Likewise.
(region_model::impl_call_strlen): Likewise.
* region-model.cc (region_model::on_call_pre): Fix return value of
known functions that don't have unknown side-effects.
* region-model.h (region_model::impl_call_alloca): Drop redundant
return value.
(region_model::impl_call_builtin_expect): Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_strlen): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_operator_delete): Likewise.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-impl-calls.cc | 31 ++--
 gcc/analyzer/region-model.cc| 49 +
 gcc/analyzer/region-model.h | 16 
 3 files changed, 53 insertions(+), 43 deletions(-)

diff --git a/gcc/analyzer/region-model-impl-calls.cc 
b/gcc/analyzer/region-model-impl-calls.cc
index eff8caa8c0a..e5a6cb2e154 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -207,7 +207,7 @@ call_details::get_or_create_conjured_svalue (const region 
*reg) const
 
 /* Handle the on_call_pre part of "alloca".  */
 
-bool
+void
 region_model::impl_call_alloca (const call_details &cd)
 {
   const svalue *size_sval = cd.get_arg_svalue (0);
@@ -215,7 +215,6 @@ region_model::impl_call_alloca (const call_details &cd)
   const svalue *ptr_sval
 = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
   cd.maybe_set_lhs (ptr_sval);
-  return true;
 }
 
 /* Handle a call to "__analyzer_describe".
@@ -274,18 +273,17 @@ region_model::impl_call_analyzer_eval (const gcall *call,
 
 /* Handle the on_call_pre part of "__builtin_expect" etc.  */
 
-bool
+void
 region_model::impl_call_builtin_expect (const call_details &cd)
 {
   /* __builtin_expect's return value is its initial argument.  */
   const svalue *sval = cd.get_arg_svalue (0);
   cd.maybe_set_lhs (sval);
-  return false;
 }
 
 /* Handle the on_call_pre part of "calloc".  */
 
-bool
+void
 region_model::impl_call_calloc (const call_details &cd)
 {
   const svalue *nmemb_sval = cd.get_arg_svalue (0);
@@ -302,7 +300,6 @@ region_model::impl_call_calloc (const call_details &cd)
= m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
   cd.maybe_set_lhs (ptr_sval);
 }
-  return true;
 }
 
 /* Handle the on_call_pre part of "error" and "error_at_line" from
@@ -397,7 +394,7 @@ region_model::impl_call_free (const call_details &cd)
 
 /* Handle the on_call_pre part of "malloc".  */
 
-bool
+void
 region_model::impl_call_malloc (const call_details &cd)
 {
   const svalue *size_sval = cd.get_arg_svalue (0);
@@ -408,7 +405,6 @@ region_model::impl_call_malloc (const call_details &cd)
= m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
   cd.maybe_set_lhs (ptr_sval);
 }
-  return true;
 }
 
 /* Handle the on_call_pre part of "memcpy" and "__builtin_memcpy".  */
@@ -439,7 +435,7 @@ region_model::impl_call_memcpy (const call_details &cd)
 
 /* Handle the on_call_pre part of "memset" and "__builtin_memset".  */
 
-bool
+void
 region_model::impl_call_memset (const call_details &cd)
 {
   const svalue *dest_sval = cd.get_arg_svalue (0);
@@ -457,12 +453,11 @@ region_model::impl_call_memset (const call_details &cd)
  num_bytes_sval);
   check_region_for_write (sized_dest_reg, cd.get_ctxt ());
   fill_region (sized_dest_reg, fill_value_u8);
-  return true;
 }
 
 /* Handle the on_call_pre part of "operator new".  */
 
-bool
+void
 region_model::impl_call_operator_new (const call_details &cd)
 {
   const svalue *size_sval = cd.get_arg_svalue (0);
@@ -473,14 +468,13 @@ region_model::impl_call_operator_new (const call_details &cd)
= m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
   cd.maybe_set_lhs (ptr_sval);
 }
-  return false;
 }
 
 /* Handle the on_call_pre part of "operator delete", which comes in
both sized and unsized variants (2 arguments and 1 argument
respectively).  */
 
-bool
+void
 region_model::impl_call_operator_delete (const call_details &cd)
 {
   const svalue *ptr_sval = cd.get_arg_svalue (0);
@@ -490,7 +484,6 @@ region_model::impl_call_operator_delete (const call_details &cd)
 poisoning pointers.  */
   

Re: [PATCH 0/13] v2 warning control by group and location (PR 74765)

2021-07-28 Thread Martin Sebor via Gcc-patches

On 7/28/21 5:14 AM, Andrew Burgess wrote:

* Martin Sebor via Gcc-patches  [2021-07-19 09:08:35 
-0600]:


On 7/17/21 2:36 PM, Jan-Benedict Glaw wrote:

Hi Martin!

On Fri, 2021-06-04 15:27:04 -0600, Martin Sebor  wrote:

This is a revised patch series to add warning control by group and
location, updated based on feedback on the initial series.

[...]

My automated checking (in this case: Using Debian's "gcc-snapshot"
package) indicates that between versions 1:20210527-1 and
1:20210630-1, building GDB breaks. Your patch is a likely candidate.
It's a case where a method asks for a nonnull argument and later on
checks for NULLness again. The build log is currently available at
(http://wolf.lug-owl.de:8080/jobs/gdb-vax-linux/5), though obviously
breaks for any target:

configure --target=vax-linux --prefix=/tmp/gdb-vax-linux
make all-gdb

[...]
[all 2021-07-16 19:19:25]   CXX    compile/compile.o
[all 2021-07-16 19:19:30] In file included from ./../gdbsupport/common-defs.h:126,
[all 2021-07-16 19:19:30]  from ./defs.h:28,
[all 2021-07-16 19:19:30]  from compile/compile.c:20:
[all 2021-07-16 19:19:30] ./../gdbsupport/gdb_unlinker.h: In constructor 'gdb::unlinker::unlinker(const char*)':
[all 2021-07-16 19:19:30] ./../gdbsupport/gdb_assert.h:35:4: error: 'nonnull' argument 'filename' compared to NULL [-Werror=nonnull-compare]
[all 2021-07-16 19:19:30]    35 |   ((void) ((expr) ? 0 :   \
[all 2021-07-16 19:19:30]       |   ~^~~~
[all 2021-07-16 19:19:30]    36 |    (gdb_assert_fail (#expr, __FILE__, __LINE__, FUNCTION_NAME), 0)))
[all 2021-07-16 19:19:30]       |    ~
[all 2021-07-16 19:19:30] ./../gdbsupport/gdb_unlinker.h:38:5: note: in expansion of macro 'gdb_assert'
[all 2021-07-16 19:19:30]    38 |     gdb_assert (filename != NULL);
[all 2021-07-16 19:19:30]       |     ^~
[all 2021-07-16 19:19:31] cc1plus: all warnings being treated as errors
[all 2021-07-16 19:19:31] make[1]: *** [Makefile:1641: compile/compile.o] Error 1
[all 2021-07-16 19:19:31] make[1]: Leaving directory '/var/lib/laminar/run/gdb-vax-linux/5/binutils-gdb/gdb'
[all 2021-07-16 19:19:31] make: *** [Makefile:11410: all-gdb] Error 2


Code is this:

   31 class unlinker
   32 {
   33  public:
   34
   35   unlinker (const char *filename) ATTRIBUTE_NONNULL (2)
   36 : m_filename (filename)
   37   {
   38 gdb_assert (filename != NULL);
   39   }

I'm quite undecided whether this is bad behavior of GCC or bad coding
style in Binutils/GDB, or both.


A warning should be expected in this case.  Before the recent GCC
change it was inadvertently suppressed in gdb_assert macros by its
operand being enclosed in parentheses.


This issue was just posted to the GDB list, and I wanted to clarify my
understanding a bit.

I believe that (at least by default) adding the nonnull attribute
allows GCC to assume (in the above case) that filename will not be
NULL and generate code accordingly.

Additionally, passing an explicit NULL (i.e. 'unlinker obj (NULL)')
would cause a compile time error.

But, there's nothing to actually stop a NULL being passed due to, say,
a logic bug in the program.  So, something like this would compile
fine:

   extern const char *ptr;
   unlinker obj (ptr);

And in a separate compilation unit:

   const char *ptr = NULL;

Obviously, the run time behaviour of such a program would be
undefined.

Given the above then, it doesn't seem crazy to want to do something
like the above, that is, add an assert to catch a logic bug in the
program.

Is there an approved mechanism through which I can tell GCC that I
really do want to do a comparison to NULL, without any warning, and
without the check being optimised out?


The manual says -fno-delete-null-pointer-checks is supposed to
prevent the removal of the null function argument test so I'd
expect adding attribute optimize ("no-delete-null-pointer-checks")
to the definition of the function to have that effect but in my
testing it didn't work (and didn't give a warning for the two
attributes on the same declaration).  That seems worth filing
a bug for.

An alternate approach that does work is to remove the nonnull
attribute from the definition while leaving it on the declaration.
But the two have to be compiled separately for it to work, which
can be hard to do, especially with LTO.

The only other way I can think of is by playing tricks with volatile.
Even that only seems to work in non-obvious ways such as:

  __attribute__ ((nonnull)) void f (int *p)
  {
 if (!p) abort ();   // eliminated

 volatile int *q = p;
 if (!q) abort ();   // eliminated

 volatile int i = 0;
     if (!&p[i]) abort ();   // works
  }

Martin
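
One further option, not proposed anywhere in this thread and shown only
as a sketch: keep the nonnull attribute on an inner function and do the
run-time check in a wrapper that carries no attribute. The test in the
wrapper is an ordinary comparison that runs before the call, so it is
neither diagnosed by -Wnonnull-compare nor removable on the strength of
the attribute. The function names are hypothetical, and this does not
map directly onto a C++ constructor such as gdb::unlinker's.

```c
#include <stddef.h>

/* The attribute lets the compiler assume P is never null inside this
   function, so a null test placed here could be removed and would
   draw -Wnonnull-compare.  */
__attribute__ ((nonnull)) static int
length_unchecked (const char *p)
{
  int n = 0;
  while (p[n] != '\0')
    n++;
  return n;
}

/* No attribute on the wrapper: the null test below happens before the
   call and is kept.  Returns -1 for a null argument (an arbitrary
   choice for this sketch), otherwise the string length.  */
int
length_checked (const char *p)
{
  if (p == NULL)
    return -1;
  return length_unchecked (p);
}
```

Callers that may hold a null pointer because of a logic bug would go
through length_checked; the cost is one extra function in the interface.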


Re: [PATCH] IBM Z: Fix 5 tests in 31-bit mode

2021-07-28 Thread Andreas Krebbel via Gcc-patches
On 7/23/21 2:47 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/global-array-element-pic2.c: Add -mzarch, add
>   an expectation for 31-bit mode.
>   * gcc.target/s390/load-imm64-1.c: Use unsigned long long.
>   * gcc.target/s390/load-imm64-2.c: Likewise.
>   * gcc.target/s390/vector/long-double-vx-macro-off-on.c: Use
>   -mzarch.
>   * gcc.target/s390/vector/long-double-vx-macro-on-off.c:
>   Likewise.

Ok. Thanks!

Andreas


[PATCH] tree-optimization/101615 - SLP permute opt with CTOR roots

2021-07-28 Thread Richard Biener
CTOR roots are not explicitly represented so we have to make sure
to materialize permutes on SLP graph entries to them.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-07-28  Richard Biener  

PR tree-optimization/101615
* tree-vect-slp.c (vect_optimize_slp): Materialize permutes
at CTOR SLP graph entries.

* gcc.dg/vect/bb-slp-pr101615-2.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c | 23 +++
 gcc/tree-vect-slp.c   | 12 ++
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c
new file mode 100644
index 000..ac89883de22
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O3 -w -Wno-psabi" } */
+
+#include "tree-vect.h"
+
+int res[6] = { 5, 7, 11, 3, 3, 3 };
+int a[6] = {5, 5, 8};
+int c;
+
+int main()
+{
+  check_vect ();
+  for (int b = 0; b <= 4; b++)
+for (; c <= 4; c++) {
+   a[0] |= 1;
+   for (int e = 0; e <= 4; e++)
+ a[e + 1] |= 3;
+}
+  for (int d = 0; d < 6; d++)
+if (a[d] != res[d])
+  __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 07cc24a60e1..a554c24e0fb 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3715,6 +3715,18 @@ vect_optimize_slp (vec_info *vinfo)
   vertices[idx].perm_out = perms.length () - 1;
 }
 
+  /* In addition to the above we have to mark outgoing permutes facing
+ non-reduction graph entries that are not represented as to be
+ materialized.  */
+  for (slp_instance instance : vinfo->slp_instances)
+if (SLP_INSTANCE_KIND (instance) == slp_inst_kind_ctor)
+  {
+   /* Just setting perm_out isn't enough for the propagation to
+  pick this up.  */
+   vertices[SLP_INSTANCE_TREE (instance)->vertex].perm_in = 0;
+   vertices[SLP_INSTANCE_TREE (instance)->vertex].perm_out = 0;
+  }
+
   /* Propagate permutes along the graph and compute materialization points.  */
   bool changed;
   bool do_materialization = false;
-- 
2.26.2


[PATCH] aarch64: Add smov alternative to sign_extend pattern

2021-07-28 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

In the testcase here we were generating a umov + sxth to move
a half-word value from SIMD to GP regs with sign-extension.
We can use a single smov instruction for it instead but the
sign-extend pattern was missing the right alternative.
The *zero_extend<SHORT:mode><GPI:mode>2_aarch64 pattern for
zero-extension already has the right alternative for
the analogous umov instruction, so this mirrors that pattern.
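
As a rough illustration of the kind of source that needs this move (this
is not the smov_1.c test from the patch, and the function name is made
up), the sketch below uses the GNU vector extension to read one signed
16-bit lane and widen it to int. Which instructions are emitted depends
on the target, compiler version and options, so no assembly is claimed
here; only the C-level result is.

```c
/* GNU vector extension (GCC/Clang): eight signed 16-bit lanes.  */
typedef short v8hi __attribute__ ((vector_size (16)));

/* Read lane 3 and widen it to int.  The conversion from short to int
   sign-extends, which on AArch64 means moving a half-word from a SIMD
   register to a general register with sign-extension.  */
int
extract_lane3 (v8hi v)
{
  return v[3];
}
```

With a vector whose lane 3 holds -5, extract_lane3 returns -5, i.e. the
sign is preserved across the widening.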

Bootstrapped and tested on aarch64-none-linux-gnu.

The test gcc.target/aarch64/sve/clastb_4.c is adjusted to scan for
the clastb  h0, p0, h0, z0.h form
instead of
the clastb  w0, p0, w0, z0.h form.

This is an improvement as the W forms of the clast instructions are more 
expensive.

Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64.md (*extend<SHORT:mode><GPI:mode>2_aarch64):
Add "r,w" alternative.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/smov_1.c: New test.
* gcc.target/aarch64/sve/clastb_4.c: Adjust clast scan-assembler.


smov.patch
Description: smov.patch


Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.

2021-07-28 Thread Martin Sebor via Gcc-patches

On 7/28/21 8:51 AM, Aldy Hernandez via Gcc-patches wrote:



On 7/28/21 4:32 PM, Jeff Law wrote:

...

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 863f1256811..0e205a41ac3 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
  generic-match.o-warn = -Wno-unused
  dfp.o-warn = -Wno-strict-aliasing
+# maybe_emit_free_warning() is picking up the inlined location for the
+# warning, not the source of the original va_heap::release() function
+# which has a pragma disabling this warning.
+tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
I think some of Martin's work may help here, but I'm not sure if it's 
all gone in yet.  It might be worth syncing with him on the state of 
the improvements to how inlining and warnings interact.  If his work 
does fix the problem here, this hunk can be removed as a distinct 
follow-up.


Yes.  He definitely has some patches in this space that were likely to 
fix this.  I will re-test without this hunk, and remove it if it's fixed.


I think I tested your patch on top of mine while it was still in
review and confirmed that the workaround isn't necessary anymore.
But if something's changed/regressed since then please let me know.

Martin


Re: Question about divide by 0 and what we can do with it

2021-07-28 Thread Aldy Hernandez via Gcc-patches




On 7/28/21 4:39 PM, Andrew MacLeod wrote:

So I'm seeing what appears to me to be inconsistent behaviour.

in pr96094.c we see:

int
foo (int x)
{
   if (x >= 2U)
     return 34;
   return 34 / x;
}

x has a range of [0,1], and since / 0 is undefined, the expectation is 
that we fold this to "return 34" and vrp1 does this:


   [local count: 767403281]:
   x_6 = ASSERT_EXPR ;
   _4 = 34 / x_6;

    [local count: 1073741824]:
   # _2 = PHI <34(2), _4(3)>

Transformed to

   [local count: 767403281]:
   _4 = 34;

    [local count: 1073741824]:
   # _2 = PHI <34(2), _4(3)>


but if we go to tree-ssa/pr61839_2.c, we see:

   volatile unsigned b = 1U;
   int c = 1;
   c = (a + 972195718) % (b ? 2 : 0);
   if (c == 1)
     ;
   else
     __builtin_abort ();

/* Dont optimize 972195717 / 0 in function foo.  */
/* { dg-final { scan-tree-dump-times "972195717 / " 1  "evrp" } } */


So why is it OK to optimize out the divide in the first case, but not in 
the second??


IIRC, the code triggering the removal of the first division by zero is 
the following in match.pd:


/* X / bool_range_Y is X.  */
 (simplify
  (div @0 SSA_NAME@1)
  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
   @0))

Aldy
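
To make the reasoning behind that match.pd rule concrete, here is a small
sketch (the function name is invented for illustration). When the divisor
is known to be 0 or 1, the only case with defined behaviour is a divisor
of 1, and x / 1 == x, so a compiler may replace the division with x.

```c
/* Hypothetical example: after masking, Y can only be 0 or 1.
   Division by 0 is undefined, so the only defined result is X / 1,
   which equals X.  Callers must not pass an even Y.  */
int
div_by_bool_range (int x, unsigned y)
{
  y &= 1u;              /* y now has the range [0, 1] */
  return x / (int) y;   /* defined only when y == 1 */
}
```

For any odd y the function returns x unchanged, whether or not the
division is actually folded away.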



Re: [PATCH 1/2] Fix debug info for ignored decls at start of assembly

2021-07-28 Thread Bernd Edlinger
On 7/28/21 2:51 PM, Richard Biener wrote:
> On Mon, 26 Jul 2021, Bernd Edlinger wrote:
> 
>> Ignored functions decls that are compiled at the start of
>> the assembly have bogus line numbers until the first .file
>> directive, as reported in PR101575.
>>
>> The work around for this issue is to emit a dummy .file
>> directive when the first function is DECL_IGNORED_P, when
>> that is not already done, mostly for -fdwarf-4.
> 
> I wonder if it makes sense to unconditionally announce the
> TU with a .file directive at the beginning.  ISTR this is
> what we now do with -gdwarf-5.
> 

Yes, that would work, even when the file name is not guessed
correctly.

Initially I had "" unconditionally here, and it did
not really hurt, except that it is visible with readelf.

> Note get_AT_string (comp_unit_die (), DW_AT_name) doesn't
> work with LTO, you'll get  then.
> 

Yeah, that's why I wanted to restrict that to the case where
it's absolutely necessary.

> Is the dwarf assembler bug reported/fixed?  Can you include
> a reference please?
> 

I've just added a bug report, it's unlikely to be fixed IMHO:
https://sourceware.org/bugzilla/show_bug.cgi?id=28149

I will add that to the patch description:

Ignored function decls that are compiled at the start of
the assembly have bogus line numbers until the first .file
directive, as reported in PR101575.

The corresponding binutils bug report is
https://sourceware.org/bugzilla/show_bug.cgi?id=28149

The workaround for this issue is to emit a dummy .file
directive when the first function is DECL_IGNORED_P, when
that is not already done, mostly for -gdwarf-4.


Thanks
Bernd.

> Thanks,
> Richard.
> 
>> 2021-07-24  Bernd Edlinger  
>>
>>  PR ada/101575
>>  * dwarf2out.c (dwarf2out_begin_prologue): Move init
>>  of fde->ignored_debug to dwarf2out_set_ignored_loc.
>>  (dwarf2out_set_ignored_loc): This is now also called
>>  when no .loc statement is to be generated, in that case
>>  we emit a dummy .file statement when needed.
>>  * final.c (final_start_function_1,
>>  final_scan_insn_1): Call debug_hooks->set_ignored_loc
>>  for all DECL_IGNORED_P functions.
>> ---
>>  gcc/dwarf2out.c | 29 +
>>  gcc/final.c |  5 ++---
>>  2 files changed, 27 insertions(+), 7 deletions(-)
>>
>> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
>> index 884f1e1..8de0d6f 100644
>> --- a/gcc/dwarf2out.c
>> +++ b/gcc/dwarf2out.c
>> @@ -1115,7 +1115,6 @@ dwarf2out_begin_prologue (unsigned int line 
>> ATTRIBUTE_UNUSED,
>>fde->dw_fde_current_label = dup_label;
>>fde->in_std_section = (fnsec == text_section
>>   || (cold_text_section && fnsec == cold_text_section));
>> -  fde->ignored_debug = DECL_IGNORED_P (current_function_decl);
>>in_text_section_p = fnsec == text_section;
>>  
>>/* We only want to output line number information for the genuine dwarf2
>> @@ -28546,10 +28545,32 @@ dwarf2out_set_ignored_loc (unsigned int line, 
>> unsigned int column,
>>  {
>>dw_fde_ref fde = cfun->fde;
>>  
>> -  fde->ignored_debug = false;
>> -  set_cur_line_info_table (function_section (fde->decl));
>> +  if (filename)
>> +{
>> +  set_cur_line_info_table (function_section (fde->decl));
>> +
>> +  dwarf2out_source_line (line, column, filename, 0, true);
>> +}
>> +  else
>> +{
>> +  fde->ignored_debug = true;
>> +
>> +  /* Work around for PR101575: output a dummy .file directive.  */
>> +  if (in_first_function_p
>> +  && debug_info_level >= DINFO_LEVEL_TERSE
>> +  && dwarf_debuginfo_p ()
>> +#if defined(HAVE_AS_GDWARF_5_DEBUG_FLAG) && 
>> defined(HAVE_AS_WORKING_DWARF_N_FLAG)
>> +  && dwarf_version <= 4
>> +#endif
>> +  && output_asm_line_debug_info ())
>> +{
>> +  const char *filename0 = get_AT_string (comp_unit_die (), DW_AT_name);
>>  
>> -  dwarf2out_source_line (line, column, filename, 0, true);
>> +  if (filename0 == NULL)
>> +filename0 = "";
>> +  maybe_emit_file (lookup_filename (filename0));
>> +}
>> +}
>>  }
>>  
>>  /* Record the beginning of a new source file.  */
>> diff --git a/gcc/final.c b/gcc/final.c
>> index ac6892d..82a5767 100644
>> --- a/gcc/final.c
>> +++ b/gcc/final.c
>> @@ -1725,7 +1725,7 @@ final_start_function_1 (rtx_insn **firstp, FILE *file, 
>> int *seen,
>>if (!dwarf2_debug_info_emitted_p (current_function_decl))
>>  dwarf2out_begin_prologue (0, 0, NULL);
>>  
>> -  if (DECL_IGNORED_P (current_function_decl) && last_linenum && 
>> last_filename)
>> +  if (DECL_IGNORED_P (current_function_decl))
>>  debug_hooks->set_ignored_loc (last_linenum, last_columnnum, 
>> last_filename);
>>  
>>  #ifdef LEAF_REG_REMAP
>> @@ -2205,8 +2205,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int 
>> optimize_p ATTRIBUTE_UNUSED,
>>  }
>>else if (!DECL_IGNORED_P (current_function_decl))
>>  debug_hooks->switch_text_section ();
>> -  if (DECL_IGNORED_P (current_function_decl) && 

[PATCH][gcc/doc] Improve nonnull attribute documentation

2021-07-28 Thread Tom de Vries
Hi,

Improve nonnull attribute documentation in a number of ways:

Reorganize discussion of effects into:
- effects for calls to functions with nonnull-marked parameters, and
- effects for function definitions with nonnull-marked parameters.
This makes it clear that -fno-delete-null-pointer-checks has no effect for
optimizations based on nonnull-marked parameters in function definitions
(see PR100404).

Mention -Wnonnull-compare.

Mention workaround from PR100404 comment 7.

The workaround can be used for this scenario.  Say we have a test.c:
...
 #include <stddef.h>

 extern int isnull (char *ptr) __attribute__ ((nonnull));
 int isnull (char *ptr)
 {
   if (ptr == 0)
 return 1;
   return 0;
 }

 int
 main (void)
 {
   char *ptr = NULL;
   if (isnull (ptr)) __builtin_abort ();
   return 0;
 }
...

The test-case contains a mistake: ptr == NULL, and we want to detect the
mistake using an abort:
...
$ gcc test.c
$ ./a.out
Aborted (core dumped)
...

At -O2 however, the mistake is not detected:
...
$ gcc test.c -O2
$ ./a.out
...
which is what -Wnonnull-compare (not shown here) warns about.

The easiest way to fix this is by dropping the nonnull attribute.  But that
also disables -Wnonnull, which would detect something like:
...
  if (isnull (NULL)) __builtin_abort ();
...
at compile time.

Using this workaround:
...
 int isnull (char *ptr)
 {
+  asm ("" : "+r"(ptr));
   if (ptr == 0)
 return 1;
   return 0;
 }
...
we still manage to detect the problem at runtime with -O2:
...
$ ~/gcc_versions/devel/install/bin/gcc test.c -O2
$ ./a.out
Aborted (core dumped)
...
while keeping the possibility to detect "isnull (NULL)" at compile time.
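
For illustration, the barrier can also be wrapped in a small helper so the intent is visible at the point of use.  This is only a sketch assuming GCC-style extended asm; HIDE_NONNULL and isnull_checked are made-up names and are not part of the patch:

```c
#include <stddef.h>

/* Hypothetical helper: the empty asm makes the compiler treat PTR as
   possibly modified, so it can no longer assume PTR is non-null.  */
#define HIDE_NONNULL(ptr) __asm__ ("" : "+r" (ptr))

extern int isnull_checked (char *ptr) __attribute__ ((nonnull));

int
isnull_checked (char *ptr)
{
  HIDE_NONNULL (ptr);
  if (ptr == 0)
    return 1;
  return 0;
}
```

The declaration keeps the nonnull attribute, so callers are still checked by -Wnonnull; only the definition gives up the optimization.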

OK for trunk?

Thanks,
- Tom

[gcc/doc] Improve nonnull attribute documentation

gcc/ChangeLog:

2021-07-28  Tom de Vries  

* doc/extend.texi (nonnull attribute): Improve documentation.

---
 gcc/doc/extend.texi | 51 ---
 1 file changed, 40 insertions(+), 11 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b83cd4919bb..3389effd70c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3488,17 +3488,46 @@ my_memcpy (void *dest, const void *src, size_t len)
 @end smallexample
 
 @noindent
-causes the compiler to check that, in calls to @code{my_memcpy},
-arguments @var{dest} and @var{src} are non-null.  If the compiler
-determines that a null pointer is passed in an argument slot marked
-as non-null, and the @option{-Wnonnull} option is enabled, a warning
-is issued.  @xref{Warning Options}.  Unless disabled by
-the @option{-fno-delete-null-pointer-checks} option the compiler may
-also perform optimizations based on the knowledge that certain function
-arguments cannot be null. In addition,
-the @option{-fisolate-erroneous-paths-attribute} option can be specified
-to have GCC transform calls with null arguments to non-null functions
-into traps. @xref{Optimize Options}.
+informs the compiler that, in calls to @code{my_memcpy}, arguments
+@var{dest} and @var{src} must be non-null.
+
+The attribute has effect both for functions calls and function definitions.
+
+For function calls:
+@itemize @bullet
+@item If the compiler determines that a null pointer is
+passed in an argument slot marked as non-null, and the
+@option{-Wnonnull} option is enabled, a warning is issued.
+@xref{Warning Options}.
+@item The @option{-fisolate-erroneous-paths-attribute} option can be
+specified to have GCC transform calls with null arguments to non-null
+functions into traps.  @xref{Optimize Options}.
+@item The compiler may also perform optimizations based on the
+knowledge that certain function arguments cannot be null.  These
+optimizations can be disabled by the
+@option{-fno-delete-null-pointer-checks} option. @xref{Optimize Options}.
+@end itemize
+
+For function definitions:
+@itemize @bullet
+@item If the compiler determines that a function parameter that is
+marked with non-null is compared with null, and
+@option{-Wnonnull-compare} option is enabled, a warning is issued.
+@xref{Warning Options}.
+@item The compiler may also perform optimizations based on the
+knowledge that certain function parameters cannot be null.  This can
+be disabled by hiding the nonnullness using an inline assembly statement:
+
+@smallexample
+extern int isnull (char *ptr) __attribute__((nonnull));
+int isnull (char *ptr) @{
+  asm ("" : "+r"(ptr));
+  if (ptr == 0)
+return 1;
+  return 0;
+@}
+@end smallexample
+@end itemize
 
 If no @var{arg-index} is given to the @code{nonnull} attribute,
 all pointer arguments are marked as non-null.  To illustrate, the


Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.

2021-07-28 Thread Aldy Hernandez via Gcc-patches




On 7/28/21 4:32 PM, Jeff Law wrote:



On 7/15/2021 8:57 AM, Aldy Hernandez wrote:

As mentioned in my previous email, these are some minor changes to the
previous revision.  All I'm changing here is the call into the solver
to use range_of_expr and range_of_stmt.  Everything else remains the
same.

Tested on x86-64 Linux.

On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez  wrote:

PING.

Aldy

0003-Backwards-jump-threader-rewrite-with-ranger.patch

 From 1774338ddd1f4718884e766aae2fc48b97110c5d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez
Date: Tue, 15 Jun 2021 12:32:51 +0200
Subject: [PATCH 3/5] Backwards jump threader rewrite with ranger.

This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
gimple-range-path.*, and the path discovery bits in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

* Makefile.in (tree-ssa-loop-im.o-warn): New.
* flag-types.h (enum threader_mode): New.
* params.opt: Add entry for --param=threader-mode.
* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
(class back_threader): New.
(back_threader::back_threader): New.
(back_threader::~back_threader): New.
(back_threader::maybe_register_path): New.
(back_threader::find_taken_edge): New.
(back_threader::find_taken_edge_switch): New.
(back_threader::find_taken_edge_cond): New.
(back_threader::resolve_def): New.
(back_threader::resolve_phi): New.
(back_threader::find_paths_to_names): New.
(back_threader::find_paths): New.
(dump_path): New.
(debug): New.
(thread_jumps::find_jump_threads_backwards): Call ranger threader.
(thread_jumps::find_jump_threads_backwards_with_ranger): New.
(pass_thread_jumps::execute): Abstract out code...
(try_thread_blocks): ...here.
* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
Abstract out threading candidate code to...
(single_succ_to_potentially_threadable_block): ...here.
* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
New.
* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
* tree-ssa-threadupdate.h (class jump_thread_path_registry):
Return bool from register_jump_thread.

libgomp/ChangeLog:

* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
threader.
* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
* gcc.c-torture/compile/pr83510.c: Same.
* gcc.dg/loop-unswitch-2.c: Same.
* gcc.dg/old-style-asm-1.c: Same.
* gcc.dg/pr68317.c: Same.
* gcc.dg/pr97567-2.c: Same.
* gcc.dg/predict-9.c: Same.
* gcc.dg/shrink-wrap-loop.c: Same.
* gcc.dg/sibcall-1.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
* gcc.dg/tree-ssa/pr21001.c: Same.
* gcc.dg/tree-ssa/pr21294.c: Same.
* gcc.dg/tree-ssa/pr21417.c: Same.
* gcc.dg/tree-ssa/pr21458-2.c: Same.
* gcc.dg/tree-ssa/pr21563.c: Same.
* gcc.dg/tree-ssa/pr49039.c: Same.
* gcc.dg/tree-ssa/pr61839_1.c: Same.
* gcc.dg/tree-ssa/pr61839_3.c: Same.
* gcc.dg/tree-ssa/pr77445-2.c: Same.
* gcc.dg/tree-ssa/split-path-4.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
* gcc.dg/tree-ssa/vrp02.c: Same.
* gcc.dg/tree-ssa/vrp03.c: Same.
* gcc.dg/tree-ssa/vrp05.c: Same.
* gcc.dg/tree-ssa/vrp06.c: Same.
* gcc.dg/tree-ssa/vrp07.c: Same.
* gcc.dg/tree-ssa/vrp09.c: Same.
* gcc.dg/tree-ssa/vrp19.c: Same.
* gcc.dg/tree-ssa/vrp20.c: Same.
* gcc.dg/tree-ssa/vrp33.c: Same.
* gcc.dg/uninit-pred-9_b.c: Same.
* gcc.dg/vect/bb-slp-16.c: Same.
* gcc.target/i386/avx2-vect-aggressive.c: Same.
* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
---
  gcc/Makefile.in   |   5 +
  gcc/flag-types.h  |   7 +
  gcc/params.opt 

Question about divide by 0 and what we can do with it

2021-07-28 Thread Andrew MacLeod via Gcc-patches

So I'm seeing what appears to me to be inconsistent behaviour.

in pr96094.c we see:

int
foo (int x)
{
  if (x >= 2U)
    return 34;
  return 34 / x;
}

x has a range of [0,1] and since / 0 is undefined, the expectation is 
that we fold this to "return 34" and vrp1 does this:


  [local count: 767403281]:
  x_6 = ASSERT_EXPR ;
  _4 = 34 / x_6;

   [local count: 1073741824]:
  # _2 = PHI <34(2), _4(3)>

Transformed to

  [local count: 767403281]:
  _4 = 34;

   [local count: 1073741824]:
  # _2 = PHI <34(2), _4(3)>


but if we go to tree-ssa/pr61839_2.c, we see:

  volatile unsigned b = 1U;
  int c = 1;
  c = (a + 972195718) % (b ? 2 : 0);
  if (c == 1)
    ;
  else
    __builtin_abort ();

/* Dont optimize 972195717 / 0 in function foo.  */
/* { dg-final { scan-tree-dump-times "972195717 / " 1  "evrp" } } */


So why is it OK to optimize out the divide in the first case, but not in 
the second??
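
For reference, the first fold can be written at the source level roughly as below.  This is a sketch for illustration only; foo_folded is a made-up name, and the two functions agree on every input for which the division is defined (x == 0 is the undefined case and must not be passed in):

```c
/* Original from pr96094.c: on the fall-through path x is 0 or 1.  */
int
foo (int x)
{
  if (x >= 2U)
    return 34;
  return 34 / x;
}

/* What the fold amounts to: x == 0 would be undefined, so the compiler
   may assume x == 1, and 34 / 1 is 34.  */
int
foo_folded (int x)
{
  (void) x;
  return 34;
}
```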


Furthermore, If I tweak the second testcase to:

  int a = -1;
  volatile unsigned b = 1U;
  int c = 1;
  c = (a + 972195718) / (b ? 2 : 0);
  if (c == 486097858)
    ;
  else
    __builtin_abort ();

  int d = 1;
  d = (a + 972195718) / (b ? 1 : 0);
  if (d == 972195717)
    ;
  else
    __builtin_abort ();
  return d;

Note the only difference is that the first case divides by 0 or 2, the second 
case by 0 or 1...


we quite happily produce:

  :
  b ={v} 1;
  b.1_2 ={v} b;
  if (b.1_2 != 0)
    goto ; [INV]
  else
    goto ; [INV]

   :

   :
  # iftmp.0_7 = PHI <2(2), 0(3)>
  c_14 = 972195717 / iftmp.0_7;
  if (c_14 == 486097858)
    goto ; [INV]
  else
    goto ; [INV]

   :
  __builtin_abort ();

   :
  b.2_4 ={v} b;
  _5 = b.2_4 != 0;
  _6 = (int) _5;
  return 972195717;


which has removed the second call to __builtin_abort () (even before we 
get to EVRP!)


So the issue doesn't seem to be removing the divide by 0; it seems to be 
a pattern match for [0,1] that is triggering.


I would argue the test case should not be testing for not removing the 
divide by 0... Because we can now fold c_14 to be 486097858, and I 
think that is a valid transformation?  (assuming no non-call exceptions 
of course)
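
To make the arithmetic concrete: with a == -1 the dividend is 972195717, and if the zero divisor is treated as unreachable the two quotients are 486097858 and 972195717.  A sketch with made-up function names, meant to be called only with nonzero b so that no division by zero is ever executed:

```c
/* Modelled on the tweaked testcase: B selects between a real divisor
   and 0.  Only the nonzero selection has defined behaviour.  */
int
div_by_2_or_0 (int a, unsigned b)
{
  return (a + 972195718) / (b ? 2 : 0);
}

int
div_by_1_or_0 (int a, unsigned b)
{
  return (a + 972195718) / (b ? 1 : 0);
}
```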


Andrew



Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.

2021-07-28 Thread Jeff Law via Gcc-patches




On 7/15/2021 8:57 AM, Aldy Hernandez wrote:

As mentioned in my previous email, these are some minor changes to the
previous revision.  All I'm changing here is the call into the solver
to use range_of_expr and range_of_stmt.  Everything else remains the
same.

Tested on x86-64 Linux.

On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez  wrote:

PING.

Aldy

0003-Backwards-jump-threader-rewrite-with-ranger.patch

 From 1774338ddd1f4718884e766aae2fc48b97110c5d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Tue, 15 Jun 2021 12:32:51 +0200
Subject: [PATCH 3/5] Backwards jump threader rewrite with ranger.

This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
gimple-range-path.*, and the path discovery bits in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

* Makefile.in (tree-ssa-loop-im.o-warn): New.
* flag-types.h (enum threader_mode): New.
* params.opt: Add entry for --param=threader-mode.
* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
(class back_threader): New.
(back_threader::back_threader): New.
(back_threader::~back_threader): New.
(back_threader::maybe_register_path): New.
(back_threader::find_taken_edge): New.
(back_threader::find_taken_edge_switch): New.
(back_threader::find_taken_edge_cond): New.
(back_threader::resolve_def): New.
(back_threader::resolve_phi): New.
(back_threader::find_paths_to_names): New.
(back_threader::find_paths): New.
(dump_path): New.
(debug): New.
(thread_jumps::find_jump_threads_backwards): Call ranger threader.
(thread_jumps::find_jump_threads_backwards_with_ranger): New.
(pass_thread_jumps::execute): Abstract out code...
(try_thread_blocks): ...here.
* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
Abstract out threading candidate code to...
(single_succ_to_potentially_threadable_block): ...here.
* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
New.
* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
* tree-ssa-threadupdate.h (class jump_thread_path_registry):
Return bool from register_jump_thread.

libgomp/ChangeLog:

* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
threader.
* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
* gcc.c-torture/compile/pr83510.c: Same.
* gcc.dg/loop-unswitch-2.c: Same.
* gcc.dg/old-style-asm-1.c: Same.
* gcc.dg/pr68317.c: Same.
* gcc.dg/pr97567-2.c: Same.
* gcc.dg/predict-9.c: Same.
* gcc.dg/shrink-wrap-loop.c: Same.
* gcc.dg/sibcall-1.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
* gcc.dg/tree-ssa/pr21001.c: Same.
* gcc.dg/tree-ssa/pr21294.c: Same.
* gcc.dg/tree-ssa/pr21417.c: Same.
* gcc.dg/tree-ssa/pr21458-2.c: Same.
* gcc.dg/tree-ssa/pr21563.c: Same.
* gcc.dg/tree-ssa/pr49039.c: Same.
* gcc.dg/tree-ssa/pr61839_1.c: Same.
* gcc.dg/tree-ssa/pr61839_3.c: Same.
* gcc.dg/tree-ssa/pr77445-2.c: Same.
* gcc.dg/tree-ssa/split-path-4.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
* gcc.dg/tree-ssa/vrp02.c: Same.
* gcc.dg/tree-ssa/vrp03.c: Same.
* gcc.dg/tree-ssa/vrp05.c: Same.
* gcc.dg/tree-ssa/vrp06.c: Same.
* gcc.dg/tree-ssa/vrp07.c: Same.
* gcc.dg/tree-ssa/vrp09.c: Same.
* gcc.dg/tree-ssa/vrp19.c: Same.
* gcc.dg/tree-ssa/vrp20.c: Same.
* gcc.dg/tree-ssa/vrp33.c: Same.
* gcc.dg/uninit-pred-9_b.c: Same.
* gcc.dg/vect/bb-slp-16.c: Same.
* gcc.target/i386/avx2-vect-aggressive.c: Same.
* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
---
  gcc/Makefile.in   |   5 +
  gcc/flag-types.h  |   7 +
  gcc/params.opt|  17 +
  

Re: [PATCH] correct formatting of function pointers in -Warray-bounds (PR 101601)

2021-07-28 Thread Jeff Law via Gcc-patches




On 7/27/2021 2:45 PM, Martin Sebor via Gcc-patches wrote:

When mentioning the type of the accessed object -Warray-bounds
treats singleton objects as arrays of one element for simplicity.
But because the code doesn't distinguish between function and
object pointers, a warning for an out-of-bounds index into
a singleton function pointer object attempts to create an array
of one function.  Since arrays of functions are invalid,
the helper function the code calls fails with an error issued
to the user.

To avoid this the attached patch avoids this singleton-to-array
shortcut for function pointers.  Tested on x86_64-linux.

Martin


gcc-101601.diff

PR middle-end/101601 - [12 Regression] -Warray-bounds triggers error: arrays of 
functions are not meaningful

PR middle-end/101601

gcc/ChangeLog:

* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Remove
a pointless test.
Handle pointers to functions.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Warray-bounds-25.C: New test.
* gcc.dg/Warray-bounds-85.c: New test.

OK
jeff



Re: [PATCH] incorrect arguments designated in -Wnonnull for arrays

2021-07-28 Thread Jeff Law via Gcc-patches




On 7/28/2021 12:56 AM, Uecker, Martin wrote:

Am Dienstag, den 27.07.2021, 10:55 -0600 schrieb Martin Sebor:

On 7/26/21 12:22 PM, Jeff Law via Gcc-patches wrote:

On 7/25/2021 10:23 AM, Uecker, Martin wrote:

Two arguments are switched for -Wnonnull when
warning about array parameters with bounds > 0
and which are NULL.

This patch corrects the mistake.

Martin


2021-07-25  Martin Uecker  

gcc/
   * calls.c (maybe_warn_rdwr_sizes): Correct argument
   numbers in warning that were switched.

gcc/testsuite/
   * gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.

I'll defer to Martin Sebor on this.

Martin S., can you cover the review of this patch from Martin U?

The patch is correct.  Thanks for the fix!  It would ideally go
into GCC 11 as well.

Committed.

Should I also push it to origin/releases/gcc-11 ?

Yes, the branch should be open for bugfixes.
jeff


[PATCH] aarch64: Don't include vec_select high-half in SIMD multiply cost

2021-07-28 Thread Jonathan Wright via Gcc-patches
Hi,

The Neon multiply/multiply-accumulate/multiply-subtract instructions
can select the top or bottom half of the operand registers. This
selection does not change the cost of the underlying instruction and
this should be reflected by the RTL cost function.

This patch adds RTL tree traversal in the Neon multiply cost function
to match vec_select high-half of its operands. This traversal
prevents the cost of the vec_select from being added into the cost of
the multiply - meaning that these instructions can now be emitted in
the combine pass as they are no longer deemed prohibitively
expensive.
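
A scalar model may help show why the half selection should be free.  This is only a sketch assuming 8 x 16-bit lanes; widening_mul_half is a made-up name and says nothing about how GCC or the hardware implements the instruction.  Selecting the high half only changes the starting index, while the multiply work is identical:

```c
#include <stdint.h>

#define LANES 8  /* assumed: 8 x 16-bit lanes in a 128-bit register */

/* Widening multiply of the low or high half of A and B.  HIGH selects
   lanes 4..7 instead of lanes 0..3; the loop body is the same.  */
void
widening_mul_half (const int16_t a[LANES], const int16_t b[LANES],
                   int high, int32_t out[LANES / 2])
{
  int base = high ? LANES / 2 : 0;
  for (int i = 0; i < LANES / 2; i++)
    out[i] = (int32_t) a[base + i] * (int32_t) b[base + i];
}
```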

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-07-19  Jonathan Wright  

* config/aarch64/aarch64.c (aarch64_vec_select_high_operand_p):
Define.
(aarch64_rtx_mult_cost): Traverse RTL tree to prevent cost of
vec_select high-half from being added into Neon multiply
cost.
* rtlanal.c (vec_series_highpart_p): Define.
* rtlanal.h (vec_series_highpart_p): Declare.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vmul_high_cost.c: New test.


rb14704.patch
Description: rb14704.patch


Re: [PATCH V2] aarch64: Don't include vec_select in SIMD multiply cost

2021-07-28 Thread Jonathan Wright via Gcc-patches
Hi,

V2 of the patch addresses the initial review comments, factors out
common code (as we discussed off-list) and adds a set of unit tests
to verify the code generation benefit.

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-07-19  Jonathan Wright  

* config/aarch64/aarch64.c (aarch64_strip_duplicate_vec_elt):
Define.
(aarch64_rtx_mult_cost): Traverse RTL tree to prevent
vec_select cost from being added into Neon multiply cost.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vmul_element_cost.c: New test.



From: Richard Sandiford 
Sent: 22 July 2021 18:16
To: Jonathan Wright 
Cc: gcc-patches@gcc.gnu.org ; Kyrylo Tkachov 

Subject: Re: [PATCH] aarch64: Don't include vec_select in SIMD multiply cost 
 
Jonathan Wright  writes:
> Hi,
>
> The Neon multiply/multiply-accumulate/multiply-subtract instructions
> can take various forms - multiplying full vector registers of values
> or multiplying one vector by a single element of another. Regardless
> of the form used, these instructions have the same cost, and this
> should be reflected by the RTL cost function.
>
> This patch adds RTL tree traversal in the Neon multiply cost function
> to match the vec_select used by the lane-referencing forms of the
> instructions already mentioned. This traversal prevents the cost of
> the vec_select from being added into the cost of the multiply -
> meaning that these instructions can now be emitted in the combine
> pass as they are no longer deemed prohibitively expensive.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?
>
> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-07-19  Jonathan Wright  
>
> * config/aarch64/aarch64.c (aarch64_rtx_mult_cost): Traverse
> RTL tree to prevent vec_select from being added into Neon
> multiply cost.
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> f5b25a7f7041645921e6ad85714efda73b993492..b368303b0e699229266e6d008e28179c496bf8cd
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11985,6 +11985,21 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, 
> int outer, bool speed)
>    op0 = XEXP (op0, 0);
>  else if (GET_CODE (op1) == VEC_DUPLICATE)
>    op1 = XEXP (op1, 0);
> +   /* The same argument applies to the VEC_SELECT when using the lane-
> +  referencing forms of the MUL/MLA/MLS instructions. Without the
> +  traversal here, the combine pass deems these patterns too
> +  expensive and subsequently does not emit the lane-referencing
> +  forms of the instructions. In addition, canonical form is for the
> +  VEC_SELECT to be the second argument of the multiply - thus only
> +  op1 is traversed.  */
> +   if (GET_CODE (op1) == VEC_SELECT
> +   && GET_MODE_NUNITS (GET_MODE (op1)).to_constant () == 1)
> + op1 = XEXP (op1, 0);
> +   else if ((GET_CODE (op1) == ZERO_EXTEND
> + || GET_CODE (op1) == SIGN_EXTEND)
> +    && GET_CODE (XEXP (op1, 0)) == VEC_SELECT
> +    && GET_MODE_NUNITS (GET_MODE (op1)).to_constant () == 1)
> + op1 = XEXP (XEXP (op1, 0), 0);

I think this logically belongs in the “GET_CODE (op1) == VEC_DUPLICATE”
if block, since the condition is never true otherwise.  We can probably
skip the GET_MODE_NUNITS tests, but if you'd prefer to keep them, I think
it would be better to add them to the existing VEC_DUPLICATE tests rather
than restrict them to the VEC_SELECT ones.
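
The nesting suggested here can be sketched with a toy model.  The types and names below (toy_rtx, toy_strip_duplicate_vec_elt) are stand-ins invented for illustration and are not GCC's rtx API; the point is only that the VEC_SELECT handling sits inside the VEC_DUPLICATE case:

```c
#include <stddef.h>

/* Toy stand-ins for RTL codes and expressions; not GCC's real types.  */
enum toy_code { TOY_REG, TOY_VEC_DUPLICATE, TOY_VEC_SELECT,
                TOY_ZERO_EXTEND, TOY_SIGN_EXTEND };

struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *op0;   /* first operand, NULL for a leaf */
};

/* Strip a VEC_DUPLICATE and, nested inside it, an optional extension
   and VEC_SELECT, returning the expression whose cost should count.
   Anything that is not a VEC_DUPLICATE is returned unchanged.  */
struct toy_rtx *
toy_strip_duplicate_vec_elt (struct toy_rtx *x)
{
  if (x->code != TOY_VEC_DUPLICATE)
    return x;

  x = x->op0;
  if (x->code == TOY_VEC_SELECT)
    x = x->op0;
  else if ((x->code == TOY_ZERO_EXTEND || x->code == TOY_SIGN_EXTEND)
           && x->op0->code == TOY_VEC_SELECT)
    x = x->op0->op0;
  return x;
}
```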

Also, although this is in Advanced SIMD-specific code, I think it'd be
better to use:

  is_a (GET_MODE (op1))

instead of:

  GET_MODE_NUNITS (GET_MODE (op1)).to_constant () == 1

Do you have a testcase?

Thanks,
Richard

rb14675.patch
Description: rb14675.patch


[PATCH] tree-optimization/101615 - SLP permute opt of existing vectors

2021-07-28 Thread Richard Biener
This fixes one issue discovered when analyzing PR101615, namely
we happily push permutes to pre-existing vectors but end up
not actually permuting them.  In fact we don't want to, so force
materialization on the external.

It doesn't fix the original testcase though.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-07-28  Richard Biener  

PR tree-optimization/101615
* tree-vect-slp.c (vect_optimize_slp): Pre-existing vector
external nodes cannot be permuted so make them perm_out 0.

* gcc.dg/vect/bb-slp-pr101615-1.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c | 30 +++
 gcc/tree-vect-slp.c   |  6 ++--
 2 files changed, 34 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c
new file mode 100644
index 000..d1c9c02d517
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-additional-options "-w -Wno-psabi" } */
+
+#include "tree-vect.h"
+
+typedef int v4si __attribute__((vector_size(16)));
+
+int a[4];
+int b[4];
+
+void __attribute__((noipa))
+foo (v4si x)
+{
+  b[0] = a[3] + x[0];
+  b[1] = a[2] + x[1];
+  b[2] = a[1] + x[2];
+  b[3] = a[0] + x[3];
+}
+
+int main()
+{
+  check_vect ();
+  for (int i = 0; i < 4; ++i)
+a[i] = i;
+  v4si x = (v4si) { 8, 6, 4, 2 };
+  foo (x);
+  if (b[0] != 11 || b[1] != 8 || b[2] != 5 || b[3] != 2)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index b9d88c2d943..07cc24a60e1 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3648,8 +3648,10 @@ vect_optimize_slp (vec_info *vinfo)
   slp_tree node = vertices[idx].node;
 
   /* Handle externals and constants optimistically throughout the
-iteration.  */
-  if (SLP_TREE_DEF_TYPE (node) == vect_external_def
+iteration.  But treat existing vectors as fixed since we
+do not handle permuting them below.  */
+  if ((SLP_TREE_DEF_TYPE (node) == vect_external_def
+  && !SLP_TREE_VEC_DEFS (node).exists ())
  || SLP_TREE_DEF_TYPE (node) == vect_constant_def)
continue;
 
-- 
2.26.2


[committed] amdgcn: Fix attributes for LLVM-12 [PR 100208]

2021-07-28 Thread Andrew Stubbs
This patch follows up my previous patch and supports more variants of 
LLVM 12.


There are still other incompatibilities with LLVM 12, but at least the 
ELF attributes should now automatically tune to any LLVM 9, 10, or 12 
assembler (it would be nice if one set of options would just work 
everywhere, but no).


LLVM 11 was not tested, but is broken in other ways in any case. LLVM 13 
(dev) needs more work.


Unfortunately, the need for configure tests and the CLI instability 
within the LLVM 12 release branch means that GCC probably needs to be 
rebuilt whenever LLVM is upgraded, even for minor versions.
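
To illustrate why a rebuild is needed, here is a standalone sketch of the compile-time gating.  format_gfx900_target is a made-up name, and the macro is assumed to be left undefined by configure (that is, the assembler rejects sram-ecc for gfx900); the real code is in the patch:

```c
#include <stdio.h>
#include <stdbool.h>

/* Stand-in for a configure result; in GCC this would be defined (or
   not) via config.in.  Assumed here: left undefined.  */
/* #define HAVE_GCN_SRAM_ECC_GFX900 1 */

/* Format the target string for gfx900, dropping +sram-ecc when the
   assembler was found not to accept it.  The decision is fixed when
   this file is compiled, not when it runs.  */
int
format_gfx900_target (char *buf, size_t len, bool xnack, bool sram_ecc)
{
  bool use_sram = sram_ecc;
#ifndef HAVE_GCN_SRAM_ECC_GFX900
  use_sram = false;
#endif
  return snprintf (buf, len, "amdgcn-unknown-amdhsa--%s%s%s", "gfx900",
                   xnack ? "+xnack" : "", use_sram ? "+sram-ecc" : "");
}
```

Because the #ifndef is resolved at build time, upgrading the assembler has no effect on the emitted attributes until configure is rerun and the compiler is rebuilt.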


Andrew
amdgcn: Fix attributes for LLVM-12 [PR 100208]

This should work for a wider range of LLVM 12 variants now.
More work required for LLVM 13 though.

gcc/ChangeLog:

PR target/100208
* config.in: Regenerate.
* config/gcn/gcn-hsa.h (A_FIJI): New define.
(A_900): New define.
(A_906): New define.
(A_908): New define.
(ASM_SPEC): Use A_FIJI, A_900, A_906 and A_908.
* config/gcn/gcn.c (output_file_start): Adjust attributes according
to the assembler capabilities.
* config/gcn/mkoffload.c (main): Likewise.
* configure: Regenerate.
* configure.ac: Add tests for LLVM assembler attribute features.

diff --git a/gcc/config.in b/gcc/config.in
index 2abac530c64..affaff2e33c 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1449,6 +1449,30 @@
 #endif
 
 
+/* Define if your assembler allows -mattr=+sram-ecc for fiji. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GCN_SRAM_ECC_FIJI
+#endif
+
+
+/* Define if your assembler allows -mattr=+sram-ecc for gfx900. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GCN_SRAM_ECC_GFX900
+#endif
+
+
+/* Define if your assembler allows -mattr=+sram-ecc for gfx906. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GCN_SRAM_ECC_GFX906
+#endif
+
+
+/* Define if your assembler allows -mattr=+sram-ecc for gfx908. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GCN_SRAM_ECC_GFX908
+#endif
+
+
 /* Define to 1 if you have the `getchar_unlocked' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_GETCHAR_UNLOCKED
diff --git a/gcc/config/gcn/gcn-hsa.h b/gcc/config/gcn/gcn-hsa.h
index 724e9a381ba..fc99c8db752 100644
--- a/gcc/config/gcn/gcn-hsa.h
+++ b/gcc/config/gcn/gcn-hsa.h
@@ -75,6 +75,28 @@ extern unsigned int gcn_local_sym_hash (const char *name);
supported for gcn.  */
 #define GOMP_SELF_SPECS ""
 
+#ifdef HAVE_GCN_SRAM_ECC_FIJI
+#define A_FIJI
+#else
+#define A_FIJI "!march=*:;march=fiji:;"
+#endif
+#ifdef HAVE_GCN_SRAM_ECC_GFX900
+#define A_900
+#else
+#define A_900 "march=gfx900:;"
+#endif
+#ifdef HAVE_GCN_SRAM_ECC_GFX906
+#define A_906
+#else
+#define A_906 "march=gfx906:;"
+#endif
+#ifdef HAVE_GCN_SRAM_ECC_GFX908
+#define A_908
+#else
+#define A_908 "march=gfx908:;"
+#endif
+
+/* These targets can't have SRAM-ECC, even if a broken assembler allows it.  */
 #define DRIVER_SELF_SPECS \
   "%{march=fiji|march=gfx900|march=gfx906:%{!msram-ecc=*:-msram-ecc=off}}"
 
@@ -83,7 +105,8 @@ extern unsigned int gcn_local_sym_hash (const char *name);
  "%:last_arg(%{march=*:-mcpu=%*}) " \
  "-mattr=%{mxnack:+xnack;:-xnack} " \
  /* FIXME: support "any" when we move to HSACOv4.  */ \
- "-mattr=%{!msram-ecc=off:+sram-ecc;:-sram-ecc} " \
+ "-mattr=%{" A_FIJI A_900 A_906 A_908 \
+   "!msram-ecc=off:+sram-ecc;:-sram-ecc} " \
  "-filetype=obj"
 #define LINK_SPEC "--pie --export-dynamic"
 #define LIB_SPEC  "-lc"
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 385b90c4b00..d25c4e54e16 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -5181,18 +5181,39 @@ static void
 output_file_start (void)
 {
   const char *cpu;
+  bool use_sram = flag_sram_ecc;
   switch (gcn_arch)
 {
-case PROCESSOR_FIJI: cpu = "gfx803"; break;
-case PROCESSOR_VEGA10: cpu = "gfx900"; break;
-case PROCESSOR_VEGA20: cpu = "gfx906"; break;
-case PROCESSOR_GFX908: cpu = "gfx908"; break;
+case PROCESSOR_FIJI:
+  cpu = "gfx803";
+#ifndef HAVE_GCN_SRAM_ECC_FIJI
+  use_sram = false;
+#endif
+  break;
+case PROCESSOR_VEGA10:
+  cpu = "gfx900";
+#ifndef HAVE_GCN_SRAM_ECC_GFX900
+  use_sram = false;
+#endif
+  break;
+case PROCESSOR_VEGA20:
+  cpu = "gfx906";
+#ifndef HAVE_GCN_SRAM_ECC_GFX906
+  use_sram = false;
+#endif
+  break;
+case PROCESSOR_GFX908:
+  cpu = "gfx908";
+#ifndef HAVE_GCN_SRAM_ECC_GFX908
+  use_sram = false;
+#endif
+  break;
 default: gcc_unreachable ();
 }
 
   const char *xnack = (flag_xnack ? "+xnack" : "");
   /* FIXME: support "any" when we move to HSACOv4.  */
-  const char *sram_ecc = (flag_sram_ecc ? "+sram-ecc" : "");
+  const char *sram_ecc = (use_sram ? "+sram-ecc" : "");
 
   fprintf(asm_out_file, "\t.amdgcn_target \"amdgcn-unknown-amdhsa--%s%s%s\"\n",
  cpu, xnack, sram_ecc);
diff 

Re: [PATCH 1/2] Fix debug info for ignored decls at start of assembly

2021-07-28 Thread Richard Biener
On Mon, 26 Jul 2021, Bernd Edlinger wrote:

> Ignored functions decls that are compiled at the start of
> the assembly have bogus line numbers until the first .file
> directive, as reported in PR101575.
> 
> The work around for this issue is to emit a dummy .file
> directive when the first function is DECL_IGNORED_P, when
> that is not already done, mostly for -gdwarf-4.

I wonder if it makes sense to unconditionally announce the
TU with a .file directive at the beginning.  ISTR this is
what we now do with -gdwarf-5.

Note get_AT_string (comp_unit_die (), DW_AT_name) doesn't
work with LTO, you'll get  then.

Is the dwarf assembler bug reported/fixed?  Can you include
a reference please?

Thanks,
Richard.

> 2021-07-24  Bernd Edlinger  
> 
>   PR ada/101575
>   * dwarf2out.c (dwarf2out_begin_prologue): Move init
>   of fde->ignored_debug to dwarf2out_set_ignored_loc.
>   (dwarf2out_set_ignored_loc): This is now also called
>   when no .loc statement is to be generated, in that case
>   we emit a dummy .file statement when needed.
>   * final.c (final_start_function_1,
>   final_scan_insn_1): Call debug_hooks->set_ignored_loc
>   for all DECL_IGNORED_P functions.
> ---
>  gcc/dwarf2out.c | 29 +
>  gcc/final.c |  5 ++---
>  2 files changed, 27 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 884f1e1..8de0d6f 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -1115,7 +1115,6 @@ dwarf2out_begin_prologue (unsigned int line 
> ATTRIBUTE_UNUSED,
>fde->dw_fde_current_label = dup_label;
>fde->in_std_section = (fnsec == text_section
>|| (cold_text_section && fnsec == cold_text_section));
> -  fde->ignored_debug = DECL_IGNORED_P (current_function_decl);
>in_text_section_p = fnsec == text_section;
>  
>/* We only want to output line number information for the genuine dwarf2
> @@ -28546,10 +28545,32 @@ dwarf2out_set_ignored_loc (unsigned int line, 
> unsigned int column,
>  {
>dw_fde_ref fde = cfun->fde;
>  
> -  fde->ignored_debug = false;
> -  set_cur_line_info_table (function_section (fde->decl));
> +  if (filename)
> +{
> +  set_cur_line_info_table (function_section (fde->decl));
> +
> +  dwarf2out_source_line (line, column, filename, 0, true);
> +}
> +  else
> +{
> +  fde->ignored_debug = true;
> +
> +  /* Work around for PR101575: output a dummy .file directive.  */
> +  if (in_first_function_p
> +   && debug_info_level >= DINFO_LEVEL_TERSE
> +   && dwarf_debuginfo_p ()
> +#if defined(HAVE_AS_GDWARF_5_DEBUG_FLAG) && 
> defined(HAVE_AS_WORKING_DWARF_N_FLAG)
> +   && dwarf_version <= 4
> +#endif
> +   && output_asm_line_debug_info ())
> + {
> +   const char *filename0 = get_AT_string (comp_unit_die (), DW_AT_name);
>  
> -  dwarf2out_source_line (line, column, filename, 0, true);
> +   if (filename0 == NULL)
> + filename0 = "";
> +   maybe_emit_file (lookup_filename (filename0));
> + }
> +}
>  }
>  
>  /* Record the beginning of a new source file.  */
> diff --git a/gcc/final.c b/gcc/final.c
> index ac6892d..82a5767 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -1725,7 +1725,7 @@ final_start_function_1 (rtx_insn **firstp, FILE *file, 
> int *seen,
>if (!dwarf2_debug_info_emitted_p (current_function_decl))
>  dwarf2out_begin_prologue (0, 0, NULL);
>  
> -  if (DECL_IGNORED_P (current_function_decl) && last_linenum && 
> last_filename)
> +  if (DECL_IGNORED_P (current_function_decl))
>  debug_hooks->set_ignored_loc (last_linenum, last_columnnum, 
> last_filename);
>  
>  #ifdef LEAF_REG_REMAP
> @@ -2205,8 +2205,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int 
> optimize_p ATTRIBUTE_UNUSED,
>   }
> else if (!DECL_IGNORED_P (current_function_decl))
>   debug_hooks->switch_text_section ();
> -   if (DECL_IGNORED_P (current_function_decl) && last_linenum
> -   && last_filename)
> +   if (DECL_IGNORED_P (current_function_decl))
>   debug_hooks->set_ignored_loc (last_linenum, last_columnnum,
> last_filename);
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[COMMITTED] Return undefined range on edges which are not executed.

2021-07-28 Thread Andrew MacLeod via Gcc-patches
outgoing_edge_range_p() is the GORI work engine which starts with the 
TRUE/FALSE/switch range from an edge at the bottom of the block, and 
calculates the outgoing range of any other ssa-name which can be changed 
by that.


When we rewrite a branch to always be true or false, we get slightly 
better results if we recognize it, and simply return UNDEFINED for 
any ranges requested on the edge that cannot be taken.  The edge is 
slated to be removed and any values on the edge should henceforth be 
ignored. The cache updating mechanism will propagate/update any 
range-on-entry values in successor blocks.
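
To make the idea concrete, here is a minimal sketch in plain C rather than GCC's own classes. The names toy_range, toy_cond and toy_edge_range are hypothetical and only model the behaviour described above: a query on an edge that a folded branch can never take yields an undefined range, while any other edge gets the usual refinement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not GCC's irange: a closed interval that is
   marked undefined when it holds no values.  */
struct toy_range { bool undefined; int lo, hi; };

/* Whether the branch condition has been folded to a constant.  */
enum toy_cond { COND_UNKNOWN, COND_ALWAYS_TRUE, COND_ALWAYS_FALSE };

/* Range of x on one outgoing edge of "if (x < bound)", where x is known
   to be in [lo, hi] at the end of the block.  ON_TRUE_EDGE selects the
   edge being queried.  Assumes bound > INT_MIN.  */
static struct toy_range
toy_edge_range (enum toy_cond folded, bool on_true_edge,
		int lo, int hi, int bound)
{
  struct toy_range r = { false, lo, hi };
  assert (lo <= hi);

  /* If this edge is never taken, return undefined.  */
  if ((on_true_edge && folded == COND_ALWAYS_FALSE)
      || (!on_true_edge && folded == COND_ALWAYS_TRUE))
    {
      r.undefined = true;
      return r;
    }

  /* Otherwise refine the incoming range using the condition.  */
  if (on_true_edge)
    {
      if (r.hi > bound - 1)
	r.hi = bound - 1;
    }
  else
    {
      if (r.lo < bound)
	r.lo = bound;
    }
  if (r.lo > r.hi)
    r.undefined = true;
  return r;
}
```

In this model, asking for the false edge of a branch folded to "always true" gives an undefined range no matter what [lo, hi] was, which mirrors the early return added by the patch.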


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew


commit 04600a47224b1ff85c6fb870218b51969cceff21
Author: Andrew MacLeod 
Date:   Wed Jul 28 08:30:02 2021 -0400

Return undefined on edges which are not executed.

When a branch has been folded, mark any range requests on the unexecutable edge as
UNDEFINED.

* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Check for
cond_false and cond_true on branches.

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 17032acf8d7..c124b3c1ce4 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -1104,6 +1104,21 @@ gori_compute::outgoing_edge_range_p (irange &r, edge e, tree name,
 
   fur_stmt src (stmt, &q);
 
+  // If this edge is never taken, return undefined.
+  gcond *gc = dyn_cast<gcond *> (stmt);
+  if (gc)
+{
+  if (((e->flags & EDGE_TRUE_VALUE) && gimple_cond_false_p (gc))
+	  || ((e->flags & EDGE_FALSE_VALUE) && gimple_cond_true_p (gc)))
+	{
+	  r.set_undefined ();
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	  fprintf (dump_file, "Outgoing edge %d->%d unexecutable.\n",
+		   e->src->index, e->dest->index);
+	  return true;
+	}
+}
+
   // If NAME can be calculated on the edge, use that.
   if (is_export_p (name, e->src))
 {


[PATCH take 2] Fold (X<

2021-07-28 Thread Roger Sayle

Hi Marc,

Thanks for the feedback.  After some quality time in gdb, I now appreciate
that match.pd behaves (subtly) differently between generic and gimple, and
the trees actually being passed to tree_nonzero_bits were not quite what I
had expected.  Sorry for my confusion, the revised patch below is now much
shorter (and my follow-up patch that was originally to tree_nonzero_bits
looks like it now needs to be to get_nonzero_bits!).

This revised patch has been retested on x86_64-pc-linux-gnu with a
"make bootstrap" and "make -k check" with no new failures.

Ok for mainline?

2021-07-28  Roger Sayle  
Marc Glisse 

gcc/ChangeLog
* match.pd (bit_ior, bit_xor): Canonicalize (X*C1)|(X*C2) and
(X*C1)^(X*C2) as X*(C1+C2), and related variants, using
tree_nonzero_bits to ensure that operands are bit-wise disjoint.

gcc/testsuite/ChangeLog
* gcc.dg/fold-ior-4.c: New test.
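
The canonicalization in the ChangeLog above rests on one arithmetic fact: when two operands have no set bits in common, no carries occur, so OR, XOR and PLUS all give the same result.  The following is a small sketch in plain C that checks this for the byte-splat pattern used in the new test.  The function names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* With no common set bits there are no carries, so
   a | b == a ^ b == a + b.  The assert documents the precondition.  */
static uint32_t
ior_as_plus (uint32_t a, uint32_t b)
{
  assert ((a & b) == 0);
  return a + b;
}

/* Original form: OR of shifted copies of a byte.  */
static uint32_t
splat_ior (uint8_t i)
{
  uint32_t t = i;
  return t | (t << 8) | (t << 16) | (t << 24);
}

/* Folded form: one multiplication by 1 + 2^8 + 2^16 + 2^24.  */
static uint32_t
splat_mul (uint8_t i)
{
  return (uint32_t) i * UINT32_C (0x01010101);
}

/* Returns 1 if both forms agree for every byte value.  */
static int
splat_forms_agree (void)
{
  for (unsigned v = 0; v <= 255; v++)
    if (splat_ior ((uint8_t) v) != splat_mul ((uint8_t) v))
      return 0;
  return 1;
}
```

The disjointness requirement matters: for x = 3, x | (x << 1) is 7 while x * 3 is 9, because bits 0 and 1 of x overlap with the shifted copy.  That is the case the tree_nonzero_bits check in the patch is there to rule out.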

Roger
--

-Original Message-
From: Marc Glisse  
Sent: 26 July 2021 16:45
To: Roger Sayle 
Cc: 'GCC Patches' 
Subject: Re: [PATCH] Fold (X< The one aspect that's a little odd is that each transform is paired 
> with a convert@1 variant, using the efficient match machinery to 
> expose any zero extension to fold-const.c's tree_nonzero_bits functionality.

Copying the first transform for context

+(for op (bit_ior bit_xor)
+ (simplify
+  (op (mult:s@0 @1 INTEGER_CST@2)
+  (mult:s@3 @1 INTEGER_CST@4))
+  (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0)
+   (mult @1
+{ wide_int_to_tree (type, wi::to_wide (@2) + wi::to_wide (@4));
})))  
+(simplify
+  (op (mult:s@0 (convert@1 @2) INTEGER_CST@3)
+  (mult:s@4 (convert@1 @2) INTEGER_CST@5))
+  (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@4)) == 0)
+   (mult @1
+{ wide_int_to_tree (type, wi::to_wide (@3) + wi::to_wide (@5));
})))

Could you explain how the convert helps exactly?

--
Marc Glisse
diff --git a/gcc/match.pd b/gcc/match.pd
index beb8d27..5bc6851 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2833,6 +2833,62 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (convert (mult (convert:t @0) { cst; })
 #endif
 
+/* Canonicalize (X*C1)|(X*C2) and (X*C1)^(X*C2) to (C1+C2)*X when
+   tree_nonzero_bits allows IOR and XOR to be treated like PLUS.
+   Likewise, handle (X< 0
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0)
+   (with { wide_int wone = wi::one (TYPE_PRECISION (type));
+  wide_int c = wi::add (wi::to_wide (@2),
+wi::lshift (wone, wi::to_wide (@4))); }
+(mult @1 { wide_int_to_tree (type, c); }
+ (simplify
+  (op:c (mult:s@0 @1 INTEGER_CST@2)
+   @1)
+  (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@1)) == 0)
+   (mult @1
+{ wide_int_to_tree (type,
+wi::add (wi::to_wide (@2), 1)); })))
+ (simplify
+  (op (lshift:s@0 @1 INTEGER_CST@2)
+  (lshift:s@3 @1 INTEGER_CST@4))
+  (if (INTEGRAL_TYPE_P (type)
+   && tree_int_cst_sgn (@2) > 0
+   && tree_int_cst_sgn (@4) > 0
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0)
+   (with { tree t = type;
+  if (!TYPE_OVERFLOW_WRAPS (t))
+t = unsigned_type_for (t);
+  wide_int wone = wi::one (TYPE_PRECISION (t));
+  wide_int c = wi::add (wi::lshift (wone, wi::to_wide (@2)),
+wi::lshift (wone, wi::to_wide (@4))); }
+(convert (mult:t (convert:t @1) { wide_int_to_tree (t,c); })
+ (simplify
+  (op:c (lshift:s@0 @1 INTEGER_CST@2)
+   @1)
+  (if (INTEGRAL_TYPE_P (type)
+   && tree_int_cst_sgn (@2) > 0
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@1)) == 0)
+   (with { tree t = type;
+  if (!TYPE_OVERFLOW_WRAPS (t))
+t = unsigned_type_for (t);
+  wide_int wone = wi::one (TYPE_PRECISION (t));
+  wide_int c = wi::add (wi::lshift (wone, wi::to_wide (@2)), wone); }
+(convert (mult:t (convert:t @1) { wide_int_to_tree (t, c); }))
+
 /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax().  */
 
 (for minmax (min max FMIN_ALL FMAX_ALL)
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

unsigned int test_ior(unsigned char i)
{
  return i | (i<<8) | (i<<16) | (i<<24);
}

unsigned int test_xor(unsigned char i)
{
  return i ^ (i<<8) ^ (i<<16) ^ (i<<24);
}

unsigned int test_ior_1s(unsigned char i)
{
  return i | (i<<8);
}

unsigned int test_ior_1u(unsigned char i)
{
  unsigned int t = i;
  return t | (t<<8);
}

unsigned int test_xor_1s(unsigned char i)
{
  return i ^ (i<<8);
}

unsigned int test_xor_1u(unsigned char i)
{
  unsigned int t = i;
  return t ^ (t<<8);
}

unsigned int test_ior_2s(unsigned char i)
{
  return (i<<8) | (i<<16);
}

unsigned int test_ior_2u(unsigned char i)
{
  unsigned int 

Re: [PATCH] [i386] Add a separate function to calculate cost for WIDEN_MULT_EXPR.

2021-07-28 Thread Richard Biener via Gcc-patches
On Wed, Jul 28, 2021 at 10:35 AM liuhongt  wrote:
>
> Hi:
>   As described in PR 39821, WIDEN_MULT_EXPR should use a different cost
> model from MULT_EXPR; this patch adds ix86_widen_mult_cost for that.
> Reference basis for the cost model is https://godbolt.org/z/EMjaz4Knn.
>
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>
> gcc/ChangeLog:

can you reference PR target/39821 please?

> * config/i386/i386.c (ix86_widen_mult_cost): New function.
> (ix86_add_stmt_cost): Use ix86_widen_mult_cost for
> WIDEN_MULT_EXPR.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/sse2-pr39821.c: New test.
> * gcc.target/i386/sse4-pr39821.c: New test.
> ---
>  gcc/config/i386/i386.c   | 48 +++-
>  gcc/testsuite/gcc.target/i386/sse2-pr39821.c | 45 ++
>  gcc/testsuite/gcc.target/i386/sse4-pr39821.c |  4 ++
>  3 files changed, 96 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse2-pr39821.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/sse4-pr39821.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 876a19f4c1f..281b5fe2706 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -19757,6 +19757,44 @@ ix86_vec_cost (machine_mode mode, int cost)
>return cost;
>  }
>
> +/* Return cost of vec_widen_mult_hi/lo_,
> +   vec_widen_mul_hi/lo_ is only available for VI124_AVX2.  */
> +static int
> +ix86_widen_mult_cost (const struct processor_costs *cost,
> + enum machine_mode mode, bool uns_p)
> +{
> +  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
> +  int extra_cost = 0;
> +  int basic_cost = 0;
> +  switch (mode)
> +{
> +case V8HImode:
> +case V16HImode:
> +  if (!uns_p || mode == V16HImode)
> +   extra_cost = cost->sse_op * 2;
> +  basic_cost = cost->mulss * 2 + cost->sse_op * 4;
> +  break;
> +case V4SImode:
> +case V8SImode:
> +  /* pmulhw/pmullw can be used.  */
> +  basic_cost = cost->mulss * 2 + cost->sse_op * 2;
> +  break;
> +case V2DImode:
> +  /* pmuludq under sse2, pmuldq under sse4.1, for sign_extend,
> +require extra 4 mul, 4 add, 4 cmp and 2 shift.  */
> +  if (!TARGET_SSE4_1 && !uns_p)
> +   extra_cost = (cost->mulss + cost->addss + cost->sse_op) * 4
> + + cost->sse_op * 2;
> +  /* Fallthru.  */
> +case V4DImode:
> +  basic_cost = cost->mulss * 2 + cost->sse_op * 4;
> +  break;
> +default:
> +  gcc_unreachable();
> +}
> +  return ix86_vec_cost (mode, basic_cost + extra_cost);
> +}
> +
>  /* Return cost of multiplication in MODE.  */
>
>  static int
> @@ -22483,10 +22521,18 @@ ix86_add_stmt_cost (class vec_info *vinfo, void 
> *data, int count,
>   break;
>
> case MULT_EXPR:
> -   case WIDEN_MULT_EXPR:
> + /*For MULT_HIGHPART_EXPR, x86 only supports pmulhw,

Space after /*

otherwise OK.

> +   take it as MULT_EXPR.  */
> case MULT_HIGHPART_EXPR:
>   stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
>   break;
> + /* There's no direct instruction for WIDEN_MULT_EXPR,
> +take emulation into account.  */
> +   case WIDEN_MULT_EXPR:
> + stmt_cost = ix86_widen_mult_cost (ix86_cost, mode,
> +   TYPE_UNSIGNED (vectype));
> + break;
> +
> case NEGATE_EXPR:
>   if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
> stmt_cost = ix86_cost->sse_op;
> diff --git a/gcc/testsuite/gcc.target/i386/sse2-pr39821.c 
> b/gcc/testsuite/gcc.target/i386/sse2-pr39821.c
> new file mode 100644
> index 000..bcd4b772c98
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sse2-pr39821.c
> @@ -0,0 +1,45 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse2 -mno-sse4.1 -O3 -fdump-tree-vect-details" } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 
> "vect" } } */
> +#include <stdint.h>
> +void
> +vec_widen_smul8 (int16_t* __restrict v3, int8_t *v1, int8_t *v2, int order)
> +{
> +  while (order--)
> +*v3++ = (int16_t) *v1++ * *v2++;
> +}
> +
> +void
> +vec_widen_umul8(uint16_t* __restrict v3, uint8_t *v1, uint8_t *v2, int order)
> +{
> +  while (order--)
> +*v3++ = (uint16_t) *v1++ * *v2++;
> +}
> +
> +void
> +vec_widen_smul16(int32_t* __restrict v3, int16_t *v1, int16_t *v2, int order)
> +{
> +  while (order--)
> +*v3++ = (int32_t) *v1++ * *v2++;
> +}
> +
> +void
> +vec_widen_umul16(uint32_t* __restrict v3, uint16_t *v1, uint16_t *v2, int 
> order)
> +{
> +  while (order--)
> +*v3++ = (uint32_t) *v1++ * *v2++;
> +}
> +
> +void
> +vec_widen_smul32(int64_t* __restrict v3, int32_t *v1, int32_t *v2, int order)
> +{
> +  while (order--)
> +*v3++ = (int64_t) *v1++ * *v2++;
> +}
> +
> +void
> +vec_widen_umul32(uint64_t* __restrict v3, uint32_t *v1, uint32_t *v2, int 
> order)
> +{
> +  while (order--)
> +  

Re: retain debug stmt order when moving to successors

2021-07-28 Thread Richard Biener via Gcc-patches
On Wed, Jul 28, 2021 at 10:12 AM Alexandre Oliva  wrote:
>
>
> We iterate over debug stmts from the last one in new_bb, and we insert
> them before the first post-label stmt in each dest block, without
> moving the insertion iterator, so they end up reversed.  Moving the
> insertion iterator fixes this.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

Richard.
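
For readers following along, the reversal described in the quoted text below can be reproduced with a small stand-alone model.  This is only a sketch: a character string stands in for the statement list, and the follow_new flag is a hypothetical stand-in for the choice between GSI_SAME_STMT (insertion point stays on the original statement) and GSI_NEW_STMT (insertion point moves to the statement just inserted).

```c
#include <assert.h>
#include <string.h>

/* Insert C into the string DEST (capacity CAP bytes) at index POS,
   shifting the tail, including the terminating NUL, one slot right.  */
static void
insert_at (char *dest, size_t cap, size_t pos, char c)
{
  size_t len = strlen (dest);
  assert (pos <= len && len + 2 <= cap);
  memmove (dest + pos + 1, dest + pos, len - pos + 1);
  dest[pos] = c;
}

/* Walk SRC from its last element to its first, inserting each element
   before the insertion point POS in DEST.  If FOLLOW_NEW is zero the
   insertion point stays on the original element; otherwise it moves to
   the element just inserted.  */
static void
move_backwards (const char *src, char *dest, size_t cap, size_t pos,
		int follow_new)
{
  for (size_t i = strlen (src); i-- > 0; )
    {
      insert_at (dest, cap, pos, src[i]);
      if (!follow_new)
	pos++;	/* The original element shifted right; keep pointing at it.  */
    }
}
```

Starting from "X" with the insertion point on X, moving "abc" with follow_new == 0 produces "cbaX", and with follow_new != 0 it produces "abcX", which is the ordering the patch restores.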

> for  gcc/ChangeLog
>
> * tree-inline.c (maybe_move_debug_stmts_to_successors): Don't
> reverse debug stmts.
> ---
>  gcc/tree-inline.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index 4a07d88f10bc5..b188a21df0e07 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -2868,7 +2868,7 @@ maybe_move_debug_stmts_to_successors (copy_body_data 
> *id, basic_block new_bb)
>   gimple_set_location (stmt, UNKNOWN_LOCATION);
> }
>   gsi_remove (, false);
> - gsi_insert_before (, stmt, GSI_SAME_STMT);
> + gsi_insert_before (, stmt, GSI_NEW_STMT);
>   continue;
> }
>
> @@ -2894,7 +2894,7 @@ maybe_move_debug_stmts_to_successors (copy_body_data 
> *id, basic_block new_bb)
> new_stmt = as_a  (gimple_copy (stmt));
>   else
> gcc_unreachable ();
> - gsi_insert_before (, new_stmt, GSI_SAME_STMT);
> + gsi_insert_before (, new_stmt, GSI_NEW_STMT);
>   id->debug_stmts.safe_push (new_stmt);
>   gsi_prev ();
> }
>
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: don't access cfun in dump_function_to_file

2021-07-28 Thread Richard Biener via Gcc-patches
On Wed, Jul 28, 2021 at 10:12 AM Alexandre Oliva  wrote:
>
>
> dump_function_to_file takes the function to dump as a parameter, and
> parts of it use the local fun variable where cfun would be used
> elsewhere.  Others use cfun, presumably in error.  Fixed to use fun
> uniformly.  Added a few more tests for non-NULL fun before
> dereferencing it.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

Richard.
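
The class of bug fixed here is easy to show outside GCC.  Below is a minimal sketch with hypothetical names (toy_function, toy_cfun, toy_describe): a routine that is handed the function to describe should consult only that parameter, and should tolerate a null pointer, even while a global "current function" points somewhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct function.  */
struct toy_function { const char *name; int has_cfg; };

/* Hypothetical stand-in for GCC's cfun global.  */
static struct toy_function *toy_cfun;

/* Describe FUN.  Only FUN is consulted, never toy_cfun, and a null FUN
   is handled instead of being dereferenced.  */
static const char *
toy_describe (const struct toy_function *fun)
{
  if (fun && fun->has_cfg)
    return "cfg";
  else if (fun)
    return "no-cfg";
  return "none";
}

/* Usage sketch: the global names one function while another is dumped.  */
static void
toy_describe_demo (void)
{
  struct toy_function current = { "current", 1 };
  struct toy_function other = { "other", 0 };
  toy_cfun = &current;
  assert (strcmp (toy_describe (&other), "no-cfg") == 0);
  assert (strcmp (toy_describe (NULL), "none") == 0);
}
```

Had toy_describe read toy_cfun instead of its argument, the first assertion in the demo would see "cfg" and fail, which is the kind of mismatch the patch removes.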

>
> for  gcc/ChangeLog
>
> * tree-cfg.c (dump_function_to_file): Use fun, not cfun.
> ---
>  gcc/tree-cfg.c |   10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index 30b1b56293e3b..38269a27b7978 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -8074,9 +8074,9 @@ dump_function_to_file (tree fndecl, FILE *file, 
> dump_flags_t flags)
>: (fun->curr_properties & PROP_cfg) ? "cfg"
>: "");
>
> -  if (cfun->cfg)
> +  if (fun && fun->cfg)
> {
> - basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
> + basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (fun);
>   if (bb->count.initialized_p ())
> fprintf (file, ",%s(%" PRIu64 ")",
>  profile_quality_as_string (bb->count.quality ()),
> @@ -8162,8 +8162,8 @@ dump_function_to_file (tree fndecl, FILE *file, 
> dump_flags_t flags)
>
>tree name;
>
> -  if (gimple_in_ssa_p (cfun))
> -   FOR_EACH_SSA_NAME (ix, name, cfun)
> +  if (gimple_in_ssa_p (fun))
> +   FOR_EACH_SSA_NAME (ix, name, fun)
>   {
> if (!SSA_NAME_VAR (name)
> /* SSA name with decls without a name still get
> @@ -8199,7 +8199,7 @@ dump_function_to_file (tree fndecl, FILE *file, 
> dump_flags_t flags)
>
>fprintf (file, "}\n");
>  }
> -  else if (fun->curr_properties & PROP_gimple_any)
> +  else if (fun && (fun->curr_properties & PROP_gimple_any))
>  {
>/* The function is now in GIMPLE form but the CFG has not been
>  built yet.  Emit the single sequence of GIMPLE statements
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: [PATCH] Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-28 Thread Richard Biener via Gcc-patches
On Wed, Jul 28, 2021 at 10:19 AM Andreas Krebbel  wrote:
>
> On 7/28/21 9:43 AM, Richard Biener wrote:
> > On Wed, Jul 28, 2021 at 8:44 AM Andreas Krebbel via Gcc-patches
> >  wrote:
> >>
> >> There are also memory operands passed for in0 and in1.
> >>
> >> Ok for mainline?
> >
> > They can also be constant vectors, I'd just not specify the operand
> > kind - usually
> > expanders are not limited as to what they feed down.
>
> Right, I'll just replace "registers" with "operands" then. Ok?

OK.

Richard.

>  also to emit such a permutation.  In the former case @var{in0}, @var{in1}\n\
>  and @var{out} are all null.  In the latter case @var{in0} and @var{in1} 
> are\n\
>  the source vectors and @var{out} is the destination vector; all three are\n\
> -registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
> +operands of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
>  @var{sel} describes a permutation on one vector instead of two.\n\
>  \n\
>  Return true if the operation is possible, emitting instructions for it\n\
>
> Andreas


Re: [PATCH] analyzer: Handle strdup builtins

2021-07-28 Thread David Malcolm via Gcc-patches
On Wed, 2021-07-28 at 15:49 +0530, Siddhesh Poyarekar wrote:
> Consolidate allocator builtin handling and add support for
> __builtin_strdup and __builtin_strndup.
> 
> gcc/analyzer/ChangeLog:
> * analyzer.cc (is_named_call_p, is_std_named_call_p): Make
> first argument a const_tree.
> * analyzer.h (is_named_call_p, is_std_named_call_p):
> Likewise.
> * sm-malloc.cc (known_allocator_p): New function.
> (malloc_state_machine::on_stmt): Use it.
> 
> gcc/testsuite/ChangeLog:
> * gcc.dg/analyzer/strdup-1.c (test_4, test_5, test_6): New
> tests.

Looks good to me

Thanks
Dave




Re: [PATCH] analyzer: Recognize __builtin_free as a matching deallocator

2021-07-28 Thread David Malcolm via Gcc-patches
On Wed, 2021-07-28 at 10:34 +0530, Siddhesh Poyarekar wrote:
> Recognize __builtin_free as being equivalent to free when passed into
> __attribute__((malloc ())), similar to how it is treated when it is
> encountered as a call.  This fixes spurious warnings in glibc where
> xmalloc family of allocators as well as reallocarray, memalign,
> etc. are declared to have __builtin_free as the free function.
> 
> gcc/analyzer/ChangeLog:
> * sm-malloc.cc
> (malloc_state_machine::get_or_create_deallocator): Recognize
> __builtin_free.
> 
> gcc/testsuite/ChangeLog:
> * gcc.dg/analyzer/attr-malloc-1.c (compatible_alloc,
> compatible_alloc2): New extern allocator declarations.
> (test_9, test_10): New tests.

Looks good to me, thanks
Dave




[committed] d: Wrong evaluation order of binary expressions (PR101640)

2021-07-28 Thread Iain Buclaw via Gcc-patches
Hi,

fold_build2 can in some cases swap the order of its operands when it
considers that more optimal.  However, this breaks D's semantic guarantee
of left-to-right evaluation.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32.
Committed to mainline, and backported to the gcc-9, gcc-10, and gcc-11
release branches.

Regards,
Iain.
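
A short illustration of why operand order is observable, written in C with the sequencing made explicit through temporaries.  The callee bump_and_return is hypothetical; the real test only declares fun101640 (ref int) without a body, so the side effect shown here is an assumption chosen to make the difference visible.

```c
#include <assert.h>

/* Hypothetical callee with a side effect on its argument, playing the
   role of a function taking its parameter by reference.  */
static int
bump_and_return (int *p)
{
  assert (p);
  *p += 10;
  return *p;
}

/* Left-to-right: read VAL first, then make the call.  */
static int
sum_left_to_right (int val)
{
  int lhs = val;
  int rhs = bump_and_return (&val);
  return lhs + rhs;
}

/* Operands swapped: make the call first, then read VAL.  */
static int
sum_right_to_left (int val)
{
  int rhs = bump_and_return (&val);
  int lhs = val;
  return lhs + rhs;
}
```

For val = 1 the left-to-right version yields 1 + 11 = 12, while the swapped version yields 11 + 11 = 22.  A language that promises left-to-right evaluation must therefore not let the operands of val + fun (val) be reordered.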

---
gcc/d/ChangeLog:

PR d/101640
* expr.cc (binary_op): Use build2 instead of fold_build2.

gcc/testsuite/ChangeLog:

PR d/101640
* gdc.dg/pr96429.d: Update test.
* gdc.dg/pr101640.d: New test.
---
 gcc/d/expr.cc   |  2 +-
 gcc/testsuite/gdc.dg/pr101640.d | 11 +++
 gcc/testsuite/gdc.dg/pr96429.d  |  2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr101640.d

diff --git a/gcc/d/expr.cc b/gcc/d/expr.cc
index e76cae98f7e..b78778eb8ef 100644
--- a/gcc/d/expr.cc
+++ b/gcc/d/expr.cc
@@ -157,7 +157,7 @@ binary_op (tree_code code, tree type, tree arg0, tree arg1)
  eptype = type;
}
 
-  ret = fold_build2 (code, eptype, arg0, arg1);
+  ret = build2 (code, eptype, arg0, arg1);
 }
 
   return d_convert (type, ret);
diff --git a/gcc/testsuite/gdc.dg/pr101640.d b/gcc/testsuite/gdc.dg/pr101640.d
new file mode 100644
index 000..68de4088512
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr101640.d
@@ -0,0 +1,11 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101640
+// { dg-do compile }
+// { dg-options "-fdump-tree-original" }
+
+int fun101640(ref int);
+
+int test101640(int val)
+{
+// { dg-final { scan-tree-dump "= val \\\+ fun101640 \\\(\\\(int &\\\) 
\\\);" "original" } }
+return val + fun101640(val);
+}
diff --git a/gcc/testsuite/gdc.dg/pr96429.d b/gcc/testsuite/gdc.dg/pr96429.d
index af096e26b5a..9940a03e0ec 100644
--- a/gcc/testsuite/gdc.dg/pr96429.d
+++ b/gcc/testsuite/gdc.dg/pr96429.d
@@ -3,7 +3,7 @@
 // { dg-options "-fdump-tree-original" }
 ptrdiff_t subbyte(byte* bp1, byte* bp2)
 {
-// { dg-final { scan-tree-dump "bp1 - bp2;" "original" } }
+// { dg-final { scan-tree-dump "\\\(bp1 - bp2\\\) /\\\[ex\\\] 1;" 
"original" } }
 return bp1 - bp2;
 }
 
-- 
2.30.2



[committed] d: fix ICE at convert_expr(tree_node*, Type*, Type*) (PR101490)

2021-07-28 Thread Iain Buclaw via Gcc-patches
Hi,

This patch fixes a modulo-by-zero bug, seen in both the front-end and
code generator when testing whether a conversion from a static array to
a dynamic array is valid.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32.
Committed to mainline, and backported to the gcc-9, gcc-10, and gcc-11
release branches.

Regards,
Iain.
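
The shape of the fix can be sketched in a few lines of C.  The helper name recast_length and its signature are made up for illustration; the checks follow the d-convert.cc hunk below: only adjust the length when the element sizes differ, and test for a zero target size before the modulo is evaluated.  Overflow of dim * esize is ignored in this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the length of the dynamic array obtained by reinterpreting a
   static array of DIM elements of ESIZE bytes as elements of TSIZE
   bytes.  Returns false if the sizes do not line up; otherwise stores
   the new length in *OUT and returns true.  */
static bool
recast_length (uint64_t dim, uint64_t esize, uint64_t tsize, uint64_t *out)
{
  assert (out != NULL);
  if (esize != tsize)
    {
      /* Check TSIZE first so that % never sees a zero divisor.  */
      if (tsize == 0 || (dim * esize) % tsize != 0)
	return false;
      dim = (dim * esize) / tsize;
    }
  *out = dim;
  return true;
}
```

With this ordering, a cast between two zero-sized element types keeps the original length, a cast to a zero-sized element type from a non-zero one is rejected, and neither case reaches a division or modulo by zero.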

---
gcc/d/ChangeLog:

PR d/101490
* dmd/MERGE: Merge upstream dmd 27e388b4c.
* d-codegen.cc (build_array_index): Handle void arrays same as byte.
* d-convert.cc (convert_expr): Handle converting to zero-sized arrays.

gcc/testsuite/ChangeLog:

PR d/101490
* gdc.dg/pr101490.d: New test.
---
 gcc/d/d-codegen.cc| 16 ++
 gcc/d/d-convert.cc| 15 -
 gcc/d/dmd/MERGE   |  2 +-
 gcc/d/dmd/dcast.c | 15 +++--
 gcc/testsuite/gdc.dg/pr101490.d   | 21 +++
 .../gdc.test/fail_compilation/fail22144.d | 14 +
 6 files changed, 57 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr101490.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/fail22144.d

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
index ce7c17baaaf..f35de90b54c 100644
--- a/gcc/d/d-codegen.cc
+++ b/gcc/d/d-codegen.cc
@@ -1639,21 +1639,9 @@ build_array_index (tree ptr, tree index)
   /* Array element size.  */
   tree size_exp = size_in_bytes (target_type);
 
-  if (integer_zerop (size_exp))
+  if (integer_zerop (size_exp) || integer_onep (size_exp))
 {
-  /* Test for array of void.  */
-  if (TYPE_MODE (target_type) == TYPE_MODE (void_type_node))
-   index = fold_convert (type, index);
-  else
-   {
- /* Should catch this earlier.  */
- error ("invalid use of incomplete type %qD", TYPE_NAME (target_type));
- ptr_type = error_mark_node;
-   }
-}
-  else if (integer_onep (size_exp))
-{
-  /* Array of bytes -- No need to multiply.  */
+  /* Array of void or bytes -- No need to multiply.  */
   index = fold_convert (type, index);
 }
   else
diff --git a/gcc/d/d-convert.cc b/gcc/d/d-convert.cc
index 3073edaae9f..237c084acf5 100644
--- a/gcc/d/d-convert.cc
+++ b/gcc/d/d-convert.cc
@@ -473,13 +473,18 @@ convert_expr (tree exp, Type *etype, Type *totype)
 
  tree ptrtype = build_ctype (tbtype->nextOf ()->pointerTo ());
 
- if ((dim * esize) % tsize != 0)
+ if (esize != tsize)
{
- error ("cannot cast %qs to %qs since sizes do not line up",
-etype->toChars (), totype->toChars ());
- return error_mark_node;
+ /* Array element sizes do not match, so we must adjust the
+dimensions.  */
+ if (tsize == 0 || (dim * esize) % tsize != 0)
+   {
+ error ("cannot cast %qs to %qs since sizes do not line up",
+etype->toChars (), totype->toChars ());
+ return error_mark_node;
+   }
+ dim = (dim * esize) / tsize;
}
- dim = (dim * esize) / tsize;
 
  /* Assumes casting to dynamic array of same type or void.  */
  return d_array_value (build_ctype (totype), size_int (dim),
diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index 08bd50df212..2568993fbf4 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-f8c1ca928360dd8c9f2fbb5771e2a5e398878ca0
+27e388b4c4d292cac25811496aaf79341c05c940
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/dcast.c b/gcc/d/dmd/dcast.c
index 4dd648bcc48..d84ab7ffc21 100644
--- a/gcc/d/dmd/dcast.c
+++ b/gcc/d/dmd/dcast.c
@@ -1496,13 +1496,16 @@ Expression *castTo(Expression *e, Scope *sc, Type *t)
 // cast(U[])sa; // ==> cast(U[])sa[];
 d_uns64 fsize = t1b->nextOf()->size();
 d_uns64 tsize = tob->nextOf()->size();
-if ((((TypeSArray *)t1b)->dim->toInteger() * fsize) % tsize != 0)
+if (fsize != tsize)
 {
-// copied from sarray_toDarray() in e2ir.c
-e->error("cannot cast expression %s of type %s to %s 
since sizes don't line up",
-e->toChars(), e->type->toChars(), t->toChars());
-result = new ErrorExp();
-return;
+dinteger_t dim = ((TypeSArray *)t1b)->dim->toInteger();
+if (tsize == 0 || (dim * fsize) % tsize != 0)
+{
+e->error("cannot cast expression `%s` of type `%s` 
to `%s` since sizes don't line up",
+ e->toChars(), e->type->toChars(), 
t->toChars());
+   

[committed] d: __FUNCTION__ doesn't work in core.stdc.stdio functions without cast (PR101441)

2021-07-28 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports a fix from upstream to allow __FUNCTION__ and
__PRETTY_FUNCTION__ to be used as C string literals.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32.
Committed to mainline, and backported to the gcc-9, gcc-10, and gcc-11
release branches.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/101441
* dmd/MERGE: Merge upstream dmd f8c1ca928.
---
 gcc/d/dmd/MERGE|  2 +-
 gcc/d/dmd/expression.c |  4 ++--
 gcc/testsuite/gdc.test/compilable/b19002.d | 12 
 3 files changed, 15 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gdc.test/compilable/b19002.d

diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index 127f9f8aa86..08bd50df212 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-1d8386a63d412c9e77728b0b965025ac4dd40b75
+f8c1ca928360dd8c9f2fbb5771e2a5e398878ca0
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/expression.c b/gcc/d/dmd/expression.c
index 7166f972424..18aa6aa9ab4 100644
--- a/gcc/d/dmd/expression.c
+++ b/gcc/d/dmd/expression.c
@@ -5620,7 +5620,7 @@ Expression *FuncInitExp::resolveLoc(Loc loc, Scope *sc)
 s = "";
 Expression *e = new StringExp(loc, const_cast(s));
 e = expressionSemantic(e, sc);
-e = e->castTo(sc, type);
+e->type = Type::tstring;
 return e;
 }
 
@@ -5654,7 +5654,7 @@ Expression *PrettyFuncInitExp::resolveLoc(Loc loc, Scope 
*sc)
 
 Expression *e = new StringExp(loc, const_cast(s));
 e = expressionSemantic(e, sc);
-e = e->castTo(sc, type);
+e->type = Type::tstring;
 return e;
 }
 
diff --git a/gcc/testsuite/gdc.test/compilable/b19002.d 
b/gcc/testsuite/gdc.test/compilable/b19002.d
new file mode 100644
index 000..fd8e6d18b37
--- /dev/null
+++ b/gcc/testsuite/gdc.test/compilable/b19002.d
@@ -0,0 +1,12 @@
+module b19002;
+
+void printf(scope const char* format){}
+
+void main()
+{
+printf(__FILE__);
+printf(__FILE_FULL_PATH__);
+printf(__FUNCTION__);
+printf(__PRETTY_FUNCTION__);
+printf(__MODULE__);
+}
-- 
2.30.2



[committed] d: Compile-time reflection for supported built-ins (PR101127)

2021-07-28 Thread Iain Buclaw via Gcc-patches
Hi,

In order to allow user-code to determine whether a back-end builtin is
available without error, LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE has been
defined to delay putting back-end builtin functions until the ISA that
defines them has been declared.

However, in D there is no global namespace.  All builtins get pushed
into the `gcc.builtins' module, which is constructed during the semantic
analysis pass, and that pass has already finished by the time target
attributes are evaluated.  So builtins are not pushed by the new langhook,
because they would ultimately be ignored.  As a result, the set of builtins
exposed to D code can now only be altered from the command line.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32.
Committed to mainline, and backported to the gcc-9, gcc-10, and gcc-11
release branches.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/101127
* d-builtins.cc (d_builtin_function_ext_scope): New function.
* d-lang.cc (LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE): Define.
* d-tree.h (d_builtin_function_ext_scope): Declare.

gcc/testsuite/ChangeLog:

PR d/101127
* gdc.dg/pr101127a.d: New test.
* gdc.dg/pr101127b.d: New test.
---
 gcc/d/d-builtins.cc  | 15 +++
 gcc/d/d-lang.cc  |  2 ++
 gcc/d/d-tree.h   |  1 +
 gcc/testsuite/gdc.dg/pr101127a.d |  8 
 gcc/testsuite/gdc.dg/pr101127b.d |  7 +++
 5 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gdc.dg/pr101127a.d
 create mode 100644 gcc/testsuite/gdc.dg/pr101127b.d

diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc
index 859a8ce2a59..ff2a5776dc5 100644
--- a/gcc/d/d-builtins.cc
+++ b/gcc/d/d-builtins.cc
@@ -1204,5 +1204,20 @@ d_builtin_function (tree decl)
   return decl;
 }
 
+/* Same as d_builtin_function, but used to delay putting in back-end builtin
+   functions until the ISA that defines the builtin has been declared.
+   However in D, there is no global namespace.  All builtins get pushed into the
+   `gcc.builtins' module, which is constructed during the semantic analysis
+   pass, which has already finished by the time target attributes are evaluated.
+   So builtins are not pushed because they would be ultimately ignored.
+   The purpose of having this function then is to improve compile-time
+   reflection support to allow user-code to determine whether a given back end
+   function is enabled by the ISA.  */
+
+tree
+d_builtin_function_ext_scope (tree decl)
+{
+  return decl;
+}
 
 #include "gt-d-d-builtins.h"
diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index a65af290cb8..6ad3823d910 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
@@ -1745,6 +1745,7 @@ d_enum_underlying_base_type (const_tree type)
 #undef LANG_HOOKS_GET_ALIAS_SET
 #undef LANG_HOOKS_TYPES_COMPATIBLE_P
 #undef LANG_HOOKS_BUILTIN_FUNCTION
+#undef LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE
 #undef LANG_HOOKS_REGISTER_BUILTIN_TYPE
 #undef LANG_HOOKS_FINISH_INCOMPLETE_DECL
 #undef LANG_HOOKS_GIMPLIFY_EXPR
@@ -1776,6 +1777,7 @@ d_enum_underlying_base_type (const_tree type)
 #define LANG_HOOKS_GET_ALIAS_SET   d_get_alias_set
 #define LANG_HOOKS_TYPES_COMPATIBLE_P  d_types_compatible_p
 #define LANG_HOOKS_BUILTIN_FUNCTION         d_builtin_function
+#define LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE d_builtin_function_ext_scope
 #define LANG_HOOKS_REGISTER_BUILTIN_TYPE    d_register_builtin_type
 #define LANG_HOOKS_FINISH_INCOMPLETE_DECL   d_finish_incomplete_decl
 #define LANG_HOOKS_GIMPLIFY_EXPR   d_gimplify_expr
diff --git a/gcc/d/d-tree.h b/gcc/d/d-tree.h
index 6ef9af2a991..b03d60a5c0e 100644
--- a/gcc/d/d-tree.h
+++ b/gcc/d/d-tree.h
@@ -502,6 +502,7 @@ extern const attribute_spec d_langhook_common_attribute_table[];
 extern Type *build_frontend_type (tree);
 
 extern tree d_builtin_function (tree);
+extern tree d_builtin_function_ext_scope (tree);
 extern void d_init_builtins (void);
 extern void d_register_builtin_type (tree, const char *);
 extern void d_build_builtins_module (Module *);
diff --git a/gcc/testsuite/gdc.dg/pr101127a.d b/gcc/testsuite/gdc.dg/pr101127a.d
new file mode 100644
index 000..b56398e1929
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr101127a.d
@@ -0,0 +1,8 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101127
+// { dg-do compile { target i?86*-*-* x86_64-*-* } }
+// { dg-additional-options "-mavx" }
+
+import gcc.builtins;
+
+static assert(__traits(compiles, __builtin_ia32_andps256));
+static assert(__traits(compiles, __builtin_ia32_pmulhrsw128));
diff --git a/gcc/testsuite/gdc.dg/pr101127b.d b/gcc/testsuite/gdc.dg/pr101127b.d
new file mode 100644
index 000..b462d75c424
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr101127b.d
@@ -0,0 +1,7 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101127
+// { dg-do compile { target i?86*-*-* x86_64-*-* } }
+
+import gcc.builtins;
+
+static assert(!__traits(compiles, __builtin_ia32_andps256));
+static assert(!__traits(compiles, 

[committed] d: Change in DotTemplateExp type semantics leading to regression (PR101619)

2021-07-28 Thread Iain Buclaw via Gcc-patches
Hi,

This patch fixes a regression introduced by PR100999.

Giving dot templates a type meant that property resolution silently
started passing for code that should never have passed.  The simple fix
is to provide implementations for checkType and checkValue that give an
error about dot templates having neither a value nor type.
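
A minimal sketch of the idea behind the fix, assuming a much-simplified
expression hierarchy.  The names ExpressionSketch, DotTemplateExpSketch and
g_errors are hypothetical stand-ins, not the front-end classes, and the
base-class defaults here are an assumption made for illustration.  The only
convention taken from the patch is that checkType and checkValue return true
after issuing an error:

```cpp
#include <string>
#include <utility>
#include <vector>

// Collected diagnostics; the real front end calls error() instead.
std::vector<std::string> g_errors;

// Hypothetical base class: by default nothing is reported (an assumption).
struct ExpressionSketch {
  virtual ~ExpressionSketch() = default;
  // Return true when an error was issued, false otherwise.
  virtual bool checkType() { return false; }
  virtual bool checkValue() { return false; }
};

// A dot template has neither a type nor a value, so both checks fail.
struct DotTemplateExpSketch : ExpressionSketch {
  std::string name;
  explicit DotTemplateExpSketch(std::string n) : name(std::move(n)) {}

  bool checkType() override {
    g_errors.push_back("template `" + name + "` has no type");
    return true;
  }
  bool checkValue() override {
    g_errors.push_back("template `" + name + "` has no value");
    return true;
  }
};
```

With the overrides in place, any caller that asks through the base interface
whether the expression is usable as a type or a value gets a diagnostic
instead of silently succeeding.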

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32.
Committed to mainline, and backported to the gcc-10 and gcc-11 release
branches.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/101619
* dmd/MERGE: Merge upstream dmd 1d8386a63.
---
 gcc/d/dmd/MERGE   |  2 +-
 gcc/d/dmd/expression.c| 12 ++
 gcc/d/dmd/expression.h|  2 ++
 gcc/testsuite/gdc.test/compilable/test22133.d | 16 +
 .../gdc.test/fail_compilation/fail22133.d | 24 +++
 .../gdc.test/fail_compilation/fail7424b.d |  2 +-
 .../gdc.test/fail_compilation/fail7424c.d |  2 +-
 .../gdc.test/fail_compilation/fail7424d.d |  2 +-
 .../gdc.test/fail_compilation/fail7424e.d |  2 +-
 .../gdc.test/fail_compilation/fail7424f.d |  2 +-
 .../gdc.test/fail_compilation/fail7424g.d |  2 +-
 .../gdc.test/fail_compilation/fail7424h.d |  2 +-
 .../gdc.test/fail_compilation/fail7424i.d |  2 +-
 13 files changed, 63 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gdc.test/compilable/test22133.d
 create mode 100644 gcc/testsuite/gdc.test/fail_compilation/fail22133.d

diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index d20785d9126..127f9f8aa86 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-7a3808254878df8cb70a055bea58afc79187b778
+1d8386a63d412c9e77728b0b965025ac4dd40b75
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/expression.c b/gcc/d/dmd/expression.c
index 153819aa172..7166f972424 100644
--- a/gcc/d/dmd/expression.c
+++ b/gcc/d/dmd/expression.c
@@ -4200,6 +4200,18 @@ DotTemplateExp::DotTemplateExp(Loc loc, Expression *e, TemplateDeclaration *td)
 this->td = td;
 }
 
+bool DotTemplateExp::checkType()
+{
+error("%s %s has no type", td->kind(), toChars());
+return true;
+}
+
+bool DotTemplateExp::checkValue()
+{
+error("%s %s has no value", td->kind(), toChars());
+return true;
+}
+
 //
 
 DotVarExp::DotVarExp(Loc loc, Expression *e, Declaration *var, bool hasOverloads)
diff --git a/gcc/d/dmd/expression.h b/gcc/d/dmd/expression.h
index 2ed8fac373e..9413ad9a931 100644
--- a/gcc/d/dmd/expression.h
+++ b/gcc/d/dmd/expression.h
@@ -930,6 +930,8 @@ public:
 TemplateDeclaration *td;
 
 DotTemplateExp(Loc loc, Expression *e, TemplateDeclaration *td);
+bool checkType();
+bool checkValue();
 void accept(Visitor *v) { v->visit(this); }
 };
 
diff --git a/gcc/testsuite/gdc.test/compilable/test22133.d b/gcc/testsuite/gdc.test/compilable/test22133.d
new file mode 100644
index 000..aff762c7180
--- /dev/null
+++ b/gcc/testsuite/gdc.test/compilable/test22133.d
@@ -0,0 +1,16 @@
+// https://issues.dlang.org/show_bug.cgi?id=22133
+
+struct Slice
+{
+bool empty() const;
+int front() const;
+void popFront()() // note: requires a mutable Slice
+{}
+}
+
+enum isInputRange1(R) = is(typeof((R r) => r.popFront));
+enum isInputRange2(R) = __traits(compiles, (R r) => r.popFront);
+static assert(isInputRange1!(  Slice) == true);
+static assert(isInputRange1!(const Slice) == false);
+static assert(isInputRange2!(  Slice) == true);
+static assert(isInputRange2!(const Slice) == false);
diff --git a/gcc/testsuite/gdc.test/fail_compilation/fail22133.d b/gcc/testsuite/gdc.test/fail_compilation/fail22133.d
new file mode 100644
index 000..338d96dc7e1
--- /dev/null
+++ b/gcc/testsuite/gdc.test/fail_compilation/fail22133.d
@@ -0,0 +1,24 @@
+// https://issues.dlang.org/show_bug.cgi?id=22133
+/*
+TEST_OUTPUT
+---
+fail_compilation/fail22133.d(16): Error: `s.popFront()()` has no effect
+fail_compilation/fail22133.d(17): Error: template `s.popFront()()` has no type
+---
+*/
+struct Slice
+{
+void popFront()() {}
+}
+
+auto fail22133(const Slice s)
+{
+s.popFront;
+return s.popFront;
+}
+
+auto ok22133(Slice s)
+{
+s.popFront;
+return s.popFront;
+}
diff --git a/gcc/testsuite/gdc.test/fail_compilation/fail7424b.d b/gcc/testsuite/gdc.test/fail_compilation/fail7424b.d
index 737958ca6a3..c3fc3116939 100644
--- a/gcc/testsuite/gdc.test/fail_compilation/fail7424b.d
+++ b/gcc/testsuite/gdc.test/fail_compilation/fail7424b.d
@@ -1,7 +1,7 @@
 /*
 TEST_OUTPUT:
 ---
-fail_compilation/fail7424b.d(10): Error: expression `this.g()()` is `void` and has no value
+fail_compilation/fail7424b.d(10): Error: template `this.g()()` has no value
 ---
 */
 struct S7424b
diff --git a/gcc/testsuite/gdc.test/fail_compilation/fail7424c.d 

[Patch] gfortran.dg/dg.exp: Add libgfortran as -I flag for ISO*.h [PR101305] (was: [PATCH 3/3] [PR libfortran/101305] Fix ISO_Fortran_binding.h paths in gfortran testsuite)

2021-07-28 Thread Tobias Burnus

Hi Sandra, hi all,

On 28.07.21 06:36, Sandra Loosemore wrote:

On 7/26/21 2:13 PM, Sandra Loosemore wrote:

On 7/26/21 3:45 AM, Tobias Burnus wrote:

PS: Still, it would be nice if the proper multi-lib ISO*.h could be
found;


(Example for x86-64-gnu-linux with 32bit and 64bit support)

Namely,
  x86_64-pc-linux-gnu/libgfortran/ISO_Fortran_binding.h
  x86_64-pc-linux-gnu/32/libgfortran/ISO_Fortran_binding.h
exist and they are different.

GCC finds the correct header, when running:
  make check-fortran RUNTESTFLAGS="--target_board=unix\{-m32\}"
which runs gfortran with:
  -B.../x86_64-pc-linux-gnu/32/libgfortran/

Likewise, with "-m32" replaced by "-m64" or "",
it works and gfortran is run with
  -B.../x86_64-pc-linux-gnu/./libgfortran/


But when running both at the same time, i.e.
  make check-fortran RUNTESTFLAGS="--target_board=unix\{,-m32\}"
(note the ',' before '-m32'), gfortran is run for
both "" (= -m64) and "-m32" with:
  -B.../x86_64-pc-linux-gnu/./libgfortran/
That's fine for -m64 but for -m32 it finds the wrong
ISO_Fortran_binding.h  file.


Solution:
Add a "-I" with the proper path to find the right *.h file.
'strace' confirms that the -I include path is searched before the
path provided by "-B".

[Note: I only did in-tree testing with that patch so far.]

Comments to the attached patch? Does it look OK?


 * * *


Unfortunately, I could not get this to work.


I think the attached version does work :-)


For installed-tree testing, this resulted in diagnostics about a
nonexistent directory on the include path.  In my i686-pc-linux-gnu
build I was having other problems when I tried build-tree testing using

make check-gfortran RUNTESTFLAGS="--target-board=localhost/m64"


I have an x86-64-gnu-linux build and there I use
"--target_board=unix\{,-m32\}",
which seems to work fine and runs it first with "" (= -m64) and then
with -m32.
(See above.)

I also do not see any issues with xfailing.


BTW, I can't find any documentation for what get_multilibs is supposed
to do.  It seems to be part of Dejagnu itself rather than the gcc test
support?


Yes, but the documentation is, well, short:
https://www.gnu.org/software/dejagnu/manual/get_005fmultilibs-procedure.html

In the DejaGNU source code, there is a better description (in the comment
before the function); cf.
https://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=blob;f=lib/libgloss.exp;;hb=HEAD#l389

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
gfortran.dg/dg.exp: Add libgfortran as -I flag for ISO*.h [PR101305]

gcc/testsuite/
	PR libfortran/101305
	* gfortran.dg/dg.exp: Add '-I /libgfortran'
	compile flag.

diff --git a/gcc/testsuite/gfortran.dg/dg.exp b/gcc/testsuite/gfortran.dg/dg.exp
index 06689813d07..cb48ed3e7fb 100644
--- a/gcc/testsuite/gfortran.dg/dg.exp
+++ b/gcc/testsuite/gfortran.dg/dg.exp
@@ -28,6 +28,17 @@ if ![info exists DEFAULT_FFLAGS] then {
 # Initialize `dg'.
 dg-init
 
+# Flags for finding libgfortran ISO*.h files.
+if [info exists TOOL_OPTIONS] {
+   set specpath [get_multilibs ${TOOL_OPTIONS}]
+} else {
+   set specpath [get_multilibs]
+}
+set include_options "-I$specpath/libgfortran"
+if [file exists $specpath/libgfortran ] {
+set include_options "-I$specpath/libgfortran" 
+}
+
 global gfortran_test_path
 global gfortran_aux_module_flags
 set gfortran_test_path $srcdir/$subdir
@@ -55,10 +66,10 @@ proc dg-compile-aux-modules { args } {
 
 # Main loop.
 gfortran-dg-runtest [lsort \
-   [glob -nocomplain $srcdir/$subdir/*.\[fF\]{,90,95,03,08} ] ] "" $DEFAULT_FFLAGS
+   [glob -nocomplain $srcdir/$subdir/*.\[fF\]{,90,95,03,08} ] ] "" "$include_options $DEFAULT_FFLAGS"
 
 gfortran-dg-runtest [lsort \
-   [glob -nocomplain $srcdir/$subdir/g77/*.\[fF\] ] ] "" $DEFAULT_FFLAGS
+   [glob -nocomplain $srcdir/$subdir/g77/*.\[fF\] ] ] "" "$include_options $DEFAULT_FFLAGS"
 
 
 # All done.


Re: [PATCH 0/13] v2 warning control by group and location (PR 74765)

2021-07-28 Thread Andrew Burgess
* Martin Sebor via Gcc-patches [2021-07-19 09:08:35 -0600]:

> On 7/17/21 2:36 PM, Jan-Benedict Glaw wrote:
> > Hi Martin!
> > 
> > On Fri, 2021-06-04 15:27:04 -0600, Martin Sebor  wrote:
> > > This is a revised patch series to add warning control by group and
> > > location, updated based on feedback on the initial series.
> > [...]
> > 
> > My automated checking (in this case: Using Debian's "gcc-snapshot"
> > package) indicates that between versions 1:20210527-1 and
> > 1:20210630-1, building GDB breaks. Your patch is a likely candidate.
> > It's a case where a method asks for a nonnull argument and later on
> > checks for NULLness again. The build log is currently available at
> > (http://wolf.lug-owl.de:8080/jobs/gdb-vax-linux/5), though obviously
> > breaks for any target:
> > 
> > configure --target=vax-linux --prefix=/tmp/gdb-vax-linux
> > make all-gdb
> > 
> > [...]
> > [all 2021-07-16 19:19:25]   CXXcompile/compile.o
> > [all 2021-07-16 19:19:30] In file included from 
> > ./../gdbsupport/common-defs.h:126,
> > [all 2021-07-16 19:19:30]  from ./defs.h:28,
> > [all 2021-07-16 19:19:30]  from compile/compile.c:20:
> > [all 2021-07-16 19:19:30] ./../gdbsupport/gdb_unlinker.h: In constructor 
> > 'gdb::unlinker::unlinker(const char*)':
> > [all 2021-07-16 19:19:30] ./../gdbsupport/gdb_assert.h:35:4: error: 
> > 'nonnull' argument 'filename' compared to NULL [-Werror=nonnull-compare]
> > [all 2021-07-16 19:19:30]35 |   ((void) ((expr) ? 0 :   
> > \
> > [all 2021-07-16 19:19:30]   |   
> > ~^~~~
> > [all 2021-07-16 19:19:30]36 |(gdb_assert_fail (#expr, 
> > __FILE__, __LINE__, FUNCTION_NAME), 0)))
> > [all 2021-07-16 19:19:30]   |
> > ~
> > [all 2021-07-16 19:19:30] ./../gdbsupport/gdb_unlinker.h:38:5: note: in 
> > expansion of macro 'gdb_assert'
> > [all 2021-07-16 19:19:30]38 | gdb_assert (filename != NULL);
> > [all 2021-07-16 19:19:30]   | ^~
> > [all 2021-07-16 19:19:31] cc1plus: all warnings being treated as errors
> > [all 2021-07-16 19:19:31] make[1]: *** [Makefile:1641: compile/compile.o] 
> > Error 1
> > [all 2021-07-16 19:19:31] make[1]: Leaving directory 
> > '/var/lib/laminar/run/gdb-vax-linux/5/binutils-gdb/gdb'
> > [all 2021-07-16 19:19:31] make: *** [Makefile:11410: all-gdb] Error 2
> > 
> > 
> > Code is this:
> > 
> >   31 class unlinker
> >   32 {
> >   33  public:
> >   34
> >   35   unlinker (const char *filename) ATTRIBUTE_NONNULL (2)
> >   36 : m_filename (filename)
> >   37   {
> >   38 gdb_assert (filename != NULL);
> >   39   }
> > 
> > I'm quite undecided whether this is bad behavior of GCC or bad coding
> > style in Binutils/GDB, or both.
> 
> A warning should be expected in this case.  Before the recent GCC
> change it was inadvertently suppressed in gdb_assert macros by its
> operand being enclosed in parentheses.

This issue was just posted to the GDB list, and I wanted to clarify my
understanding a bit.

I believe that (at least by default) adding the nonnull attribute
allows GCC to assume (in the above case) that filename will not be
NULL and generate code accordingly.

Additionally, passing an explicit NULL (i.e. 'unlinker obj (NULL)')
would cause a compile time error.

But, there's nothing to actually stop a NULL being passed due to, say,
a logic bug in the program.  So, something like this would compile
fine:

  extern const char *ptr;
  unlinker obj (ptr);

And in a separate compilation unit:

  const char *ptr = NULL;

Obviously, the run time behaviour of such a program would be
undefined.

Given the above then, it doesn't seem crazy to want to do something
like the above, that is, add an assert to catch a logic bug in the
program.

Is there an approved mechanism through which I can tell GCC that I
really do want to do a comparison to NULL, without any warning, and
without the check being optimised out?
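
A minimal sketch of the scenario described above, assuming a GNU-compatible
compiler (the attribute syntax is a GCC extension).  The class
unlinker_sketch and the helper length_via are hypothetical stand-ins, not the
GDB code.  The sketch only illustrates why a NULL can still arrive at run
time; it does not answer the question of how to keep the check:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for gdb::unlinker.
class unlinker_sketch {
 public:
  // For a constructor or other non-static member function the implicit
  // `this` counts as argument 1, so `filename` is argument 2.
  explicit unlinker_sketch(const char* filename) __attribute__((nonnull(2)));

  std::size_t name_length() const { return std::strlen(m_filename); }

 private:
  const char* m_filename;
};

unlinker_sketch::unlinker_sketch(const char* filename)
    : m_filename(filename) {}

// The attribute is a promise made by the caller.  When `ptr` comes from
// elsewhere (for example another translation unit), the compiler cannot
// verify it here, so a NULL caused by a logic bug still compiles.
std::size_t length_via(const char* ptr) {
  unlinker_sketch obj(ptr);
  return obj.name_length();
}
```

Calling length_via with a valid string behaves as expected; calling it with
a pointer that happens to be NULL at run time is the undefined case discussed
above.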

Thanks,
Andrew


Re: [RFC] more no-wrap conditions for IV analyzing and scev

2021-07-28 Thread Richard Biener
On Fri, 23 Jul 2021, guojiufu wrote:

> On 2021-06-21 20:36, Richard Biener wrote:
> > On Mon, 21 Jun 2021, guojiufu wrote:
> > 
> >> On 2021-06-21 14:19, guojiufu via Gcc-patches wrote:
> >> > On 2021-06-09 19:18, guojiufu wrote:
> >> >> On 2021-06-09 17:42, guojiufu via Gcc-patches wrote:
> >> >>> On 2021-06-08 18:13, Richard Biener wrote:
> >>  On Fri, 4 Jun 2021, Jiufu Guo wrote:
> >> 
> >> >>> cut...
> >> > cut...
> >> >>
> >> 
> >> Besides the method in the previous mails, 
> >> I’m thinking of another way to split loops:
> >> 
> >> foo (int *a, int *b, unsigned k, unsigned n)
> >> {   
> >>  while (++k != n)
> >>    a[k] = b[k] + 1;   
> >> } 
> >> 
> >> We may split it into:
> >> if (k<n)
> >> {
> >>   while (++k < n)  //loop1
> >>    a[k] = b[k] + 1;   
> >> }
> >> else
> >> {
> >>  while (++k != n) //loop2
> >>    a[k] = b[k] + 1;  
> >> }
> >> 
> >> In most cases, loop1 would be hit, the overhead of this method is only
> >> checking “if (k<n)”, which would be smaller than the previous method.
> > 
> > That would be your original approach of versioning the loop.  I think
> > I suggested that for this scalar evolution and dataref analysis should
> > be enhanced to build up conditions under which IV evolutions are
> > affine (non-wrapping) and the versioning code in actual transforms
> > should then do the appropriate versioning (like the vectorizer already
> > does for niter analysis ->assumptions for example).
> 
> Hi Richi,
> 
> Thanks for your suggestion!
> 
> The original idea was trying to cover cases like multi-exit loops, while it
> seems not benefited too much.  The method you said would help for common
> cases.
> 
> I'm thinking of the methods to implement this:
> During scev analyzing, add more possible wrap checking (especially for
> unsigned)
> for convert_affine_scev/scev_probably_wraps_p/chrec_convert_1;  introducing
> no_wrap_assumption for the conditions of no-wrapping on given chrec/iv.
> And using this assumption in simple_iv_with_niters/dr_analyze_innermost.
> 
> One question is, where is the best place to add this assumption?
> Is it a flexible idea to add no_wrap_assumption to affine_iv/loop, and
> set the assumption when scev checks wrap?

I'm not sure what exactly you mean here.  I was thinking of
making analyze_scalar_evolution return more than a plain
tree, for example by providing an optional alternate output
such as a pointer to some new scev_info that could initially
be just

struct scev_info {
  tree assumptions;
};

when analyze_scalar_evolution recurses there are certain points
it can end up returning chrec_dont_know.  If we passed in the
alternate output there might be the possibility to add
to the assumption to make the result well-defined (and yes,
this extends to chrec_fold_* routines, mostly chrec_convert
I think).  Handled cases could be added piecewise, just passing
down the scev_info pointer will be intrusive initially.

For simple_iv there's already a struct we could add 'assumptions'
to (and maybe a flag to the API whether assumptions are allowed).

For DR analysis the same could be done.

Richard.
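
A minimal sketch of the loop versioning quoted above, written as plain C++
so that the two forms can be compared.  The function names and the
same_result helper are hypothetical; the sketch only covers the k < n case on
a small buffer, since the wrapping case would run past the arrays:

```cpp
#include <vector>

// Original form: the exit test is `!=`, so `k` may wrap around before
// reaching `n` when k >= n on entry.
void foo_original(int* a, const int* b, unsigned k, unsigned n) {
  while (++k != n) a[k] = b[k] + 1;
}

// Versioned form: when k < n holds on entry, `++k < n` leaves the loop at
// exactly the same iteration as `++k != n`, and the IV cannot wrap.
void foo_versioned(int* a, const int* b, unsigned k, unsigned n) {
  if (k < n) {
    while (++k < n) a[k] = b[k] + 1;   // loop1: no-wrap known
  } else {
    while (++k != n) a[k] = b[k] + 1;  // loop2: original semantics
  }
}

// Runs both forms on an 8-element buffer; only meaningful for k < n <= 8.
bool same_result(unsigned k, unsigned n) {
  std::vector<int> b{1, 2, 3, 4, 5, 6, 7, 8};
  std::vector<int> a1(b.size(), 0), a2(b.size(), 0);
  foo_original(a1.data(), b.data(), k, n);
  foo_versioned(a2.data(), b.data(), k, n);
  return a1 == a2;
}
```

The `k < n` test plays the role of an assumption under which the evolution
of `k` is affine; the proposal above is for the analysis to record such
conditions so that transforms can do the versioning themselves.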

> Thanks for your suggestions!
> 
> BR.
> Jiufu
> 
> 
> > 
> > Richard.
> > 
> cut
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] PR fortran/101564 - ICE in resolve_allocate_deallocate, at fortran/resolve.c:8169

2021-07-28 Thread Tobias Burnus

Hi Harald,

On 27.07.21 23:42, Harald Anlauf wrote:

This almost worked, needing only a restriction to %KIND and %LEN.
Note that %RE and %IM are usually definable.

Well spotted :-)

Regtested on x86_64-pc-linux-gnu.  OK?

LGTM - except [...] feel free to add them and commit without further review.
[...]

I have added the updated "final" version of the patch to give
everybody another 24h to have a look, and will commit if nobody
complains.

LGTM - thanks again.


[...] with fixing a few issues on the way before Gerhard finds them...


:-)

Tobias


Fortran: ICE in resolve_allocate_deallocate for invalid STAT argument

gcc/fortran/ChangeLog:

  PR fortran/101564
  * expr.c (gfc_check_vardef_context): Add check for KIND and LEN
  parameter inquiries.
  * match.c (gfc_match): Fix comment for %v code.
  (gfc_match_allocate, gfc_match_deallocate): Replace use of %v code
  by %e in gfc_match to allow for function references as STAT and
  ERRMSG arguments.
  * resolve.c (resolve_allocate_deallocate): Avoid NULL pointer
  dereferences and shortcut for bad STAT and ERRMSG argument to
  (DE)ALLOCATE.  Remove bogus parts of checks for STAT and ERRMSG.

gcc/testsuite/ChangeLog:

  PR fortran/101564
  * gfortran.dg/allocate_stat_3.f90: New test.
  * gfortran.dg/allocate_stat.f90: Adjust error messages.
  * gfortran.dg/implicit_11.f90: Likewise.
  * gfortran.dg/inquiry_type_ref_3.f90: Likewise.



Re: [PATCH] ubsan: Fix ICEs with DECL_REGISTER tests [PR101624]

2021-07-28 Thread Richard Biener
On Wed, 28 Jul 2021, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs, because the base is a CONST_DECL for
> the Fortran parameter, and ubsan/sanopt uses DECL_REGISTER macro on it.
> /* In VAR_DECL and PARM_DECL nodes, nonzero means declared `register'.  */
> #define DECL_REGISTER(NODE) (DECL_WRTL_CHECK (NODE)->decl_common.decl_flag_0)
> while CONST_DECL doesn't satisfy DECL_WRTL_CHECK.
> 
> The following patch checks explicitly for VAR_DECL/PARM_DECL/RESULT_DECL
> only before using DECL_REGISTER, assumes other decls aren't DECL_REGISTER.
> Not really sure about RESULT_DECL but it at least satisfies DECL_WRTL_CHECK...
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
> backports?

OK.
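
A minimal sketch of the guard pattern used by the patch, assuming a toy model
of declaration kinds.  DeclCode, DeclSketch and is_register_decl are
hypothetical and are not GCC's tree API; the point is only that the flag is
consulted solely for the kinds on which it is defined:

```cpp
// Hypothetical miniature of the relevant declaration kinds.
enum class DeclCode { VarDecl, ParmDecl, ResultDecl, ConstDecl };

struct DeclSketch {
  DeclCode code;
  bool register_flag;  // stands in for the bit read by DECL_REGISTER
};

// Mirrors the shape of the patched condition in ubsan.c: check the kind
// first, and read the flag only when it is meaningful for that kind.
bool is_register_decl(const DeclSketch& d) {
  const bool flag_applies = d.code == DeclCode::VarDecl ||
                            d.code == DeclCode::ParmDecl ||
                            d.code == DeclCode::ResultDecl;
  return flag_applies && d.register_flag;
}
```

The sanopt.c hunk uses the negated form of the same test, i.e. "not one of
these kinds, or not DECL_REGISTER", which is the same condition after
applying De Morgan's laws.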

> 2021-07-28  Jakub Jelinek  
> 
>   PR middle-end/101624
>   * ubsan.c (maybe_instrument_pointer_overflow,
>   instrument_object_size): Only test DECL_REGISTER on VAR_DECLs,
>   PARM_DECLs or RESULT_DECLs.
>   * sanopt.c (maybe_optimize_ubsan_ptr_ifn): Likewise.
> 
>   * gfortran.dg/ubsan/ubsan.exp: New file.
>   * gfortran.dg/ubsan/pr101624.f90: New test.
> 
> --- gcc/ubsan.c.jj  2021-05-10 12:22:30.425451947 +0200
> +++ gcc/ubsan.c   2021-07-27 19:18:05.926969704 +0200
> @@ -1443,7 +1443,10 @@ maybe_instrument_pointer_overflow (gimpl
>tree base;
>if (decl_p)
>  {
> -  if (DECL_REGISTER (inner))
> +  if ((VAR_P (inner)
> +|| TREE_CODE (inner) == PARM_DECL
> +|| TREE_CODE (inner) == RESULT_DECL)
> +   && DECL_REGISTER (inner))
>   return;
>base = inner;
>/* If BASE is a fixed size automatic variable or
> @@ -2115,7 +2118,10 @@ instrument_object_size (gimple_stmt_iter
>tree base;
>if (decl_p)
>  {
> -  if (DECL_REGISTER (inner))
> +  if ((VAR_P (inner)
> +|| TREE_CODE (inner) == PARM_DECL
> +|| TREE_CODE (inner) == RESULT_DECL)
> +   && DECL_REGISTER (inner))
>   return;
>base = inner;
>  }
> --- gcc/sanopt.c.jj   2021-06-14 12:27:18.605410685 +0200
> +++ gcc/sanopt.c  2021-07-27 19:16:45.667035649 +0200
> @@ -492,7 +492,10 @@ maybe_optimize_ubsan_ptr_ifn (sanopt_ctx
> , , );
>if ((offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST)
> && DECL_P (base)
> -   && !DECL_REGISTER (base)
> +   && ((!VAR_P (base)
> +&& TREE_CODE (base) != PARM_DECL
> +&& TREE_CODE (base) != RESULT_DECL)
> +   || !DECL_REGISTER (base))
> && pbitpos.is_constant ())
>   {
> offset_int expr_offset;
> --- gcc/testsuite/gfortran.dg/ubsan/ubsan.exp.jj  2021-07-27 19:59:24.889038766 +0200
> +++ gcc/testsuite/gfortran.dg/ubsan/ubsan.exp  2021-07-27 20:00:18.538326168 +0200
> @@ -0,0 +1,38 @@
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +#
> +# This file is part of GCC.
> +#
> +# GCC is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3, or (at your option)
> +# any later version.
> +#
> +# GCC is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# GCC testsuite for gfortran that checks undefined behavior sanitizer.
> +
> +# Load support procs.
> +load_lib gfortran-dg.exp
> +load_lib ubsan-dg.exp
> +
> +
> +# Initialize `dg'.
> +dg-init
> +ubsan_init
> +
> +# Main loop.
> +if [check_effective_target_fsanitize_undefined] {
> +gfortran-dg-runtest [lsort \
> +   [glob -nocomplain $srcdir/$subdir/*.\[fF\]{,90,95,03,08} ] ] "" ""
> +}
> +
> +# All done.
> +ubsan_finish
> +dg-finish
> --- gcc/testsuite/gfortran.dg/ubsan/pr101624.f90.jj  2021-07-27 19:56:51.831071747 +0200
> +++ gcc/testsuite/gfortran.dg/ubsan/pr101624.f90  2021-07-27 19:59:14.634174975 +0200
> @@ -0,0 +1,13 @@
> +! PR middle-end/101624
> +! { dg-do compile }
> +! { dg-options "-O2 -fsanitize=undefined" }
> +
> +complex function foo (x)
> +  complex, intent(in) :: x
> +  foo = aimag (x)
> +end
> +program pr101624
> +  complex, parameter :: a = (0.0, 1.0)
> +  complex :: b, foo
> +  b = foo (a)
> +end
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] analyzer: Handle strdup builtins

2021-07-28 Thread Siddhesh Poyarekar
Consolidate allocator builtin handling and add support for
__builtin_strdup and __builtin_strndup.
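
A minimal sketch of the name-and-arity half of the consolidated check,
assuming a simple lookup table.  KnownAllocator and known_allocator_sketch
are hypothetical names and do not use the analyzer's API; the real
known_allocator_p in the patch below also accepts the std:: variants and the
corresponding builtin function codes:

```cpp
#include <string_view>

// One entry per allocator recognised by name and argument count.
struct KnownAllocator {
  std::string_view name;
  unsigned num_args;
};

// Returns true when (name, num_args) matches one of the allocators whose
// result is expected to be released with free.
bool known_allocator_sketch(std::string_view name, unsigned num_args) {
  static constexpr KnownAllocator table[] = {
      {"malloc", 1}, {"calloc", 2}, {"strdup", 1}, {"strndup", 2}};
  for (const auto& entry : table)
    if (entry.name == name && entry.num_args == num_args) return true;
  return false;
}
```

Keeping the list in one predicate means a new allocator only has to be added
in one place rather than at every call site that tests for malloc-like calls.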

gcc/analyzer/ChangeLog:
* analyzer.cc (is_named_call_p, is_std_named_call_p): Make
first argument a const_tree.
* analyzer.h (is_named_call_p, is_std_named_call_p): Likewise.
* sm-malloc.cc (known_allocator_p): New function.
(malloc_state_machine::on_stmt): Use it.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/strdup-1.c (test_4, test_5, test_6): New
tests.
---
 gcc/analyzer/analyzer.cc |  8 ++---
 gcc/analyzer/analyzer.h  |  8 ++---
 gcc/analyzer/sm-malloc.cc| 41 +++-
 gcc/testsuite/gcc.dg/analyzer/strdup-1.c | 19 +++
 4 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/gcc/analyzer/analyzer.cc b/gcc/analyzer/analyzer.cc
index ddace9a0c32..b845b86cfe1 100644
--- a/gcc/analyzer/analyzer.cc
+++ b/gcc/analyzer/analyzer.cc
@@ -240,7 +240,7 @@ is_special_named_call_p (const gcall *call, const char *funcname,
Compare with special_function_p in calls.c.  */
 
 bool
-is_named_call_p (tree fndecl, const char *funcname)
+is_named_call_p (const_tree fndecl, const char *funcname)
 {
   gcc_assert (fndecl);
   gcc_assert (funcname);
@@ -292,7 +292,7 @@ is_std_function_p (const_tree fndecl)
 /* Like is_named_call_p, but look for std::FUNCNAME.  */
 
 bool
-is_std_named_call_p (tree fndecl, const char *funcname)
+is_std_named_call_p (const_tree fndecl, const char *funcname)
 {
   gcc_assert (fndecl);
   gcc_assert (funcname);
@@ -314,7 +314,7 @@ is_std_named_call_p (tree fndecl, const char *funcname)
arguments?  */
 
 bool
-is_named_call_p (tree fndecl, const char *funcname,
+is_named_call_p (const_tree fndecl, const char *funcname,
 const gcall *call, unsigned int num_args)
 {
   gcc_assert (fndecl);
@@ -332,7 +332,7 @@ is_named_call_p (tree fndecl, const char *funcname,
 /* Like is_named_call_p, but check for std::FUNCNAME.  */
 
 bool
-is_std_named_call_p (tree fndecl, const char *funcname,
+is_std_named_call_p (const_tree fndecl, const char *funcname,
 const gcall *call, unsigned int num_args)
 {
   gcc_assert (fndecl);
diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index 90143d9aba2..8de5d60821f 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -220,11 +220,11 @@ enum access_direction
 
 extern bool is_special_named_call_p (const gcall *call, const char *funcname,
 unsigned int num_args);
-extern bool is_named_call_p (tree fndecl, const char *funcname);
-extern bool is_named_call_p (tree fndecl, const char *funcname,
+extern bool is_named_call_p (const_tree fndecl, const char *funcname);
+extern bool is_named_call_p (const_tree fndecl, const char *funcname,
 const gcall *call, unsigned int num_args);
-extern bool is_std_named_call_p (tree fndecl, const char *funcname);
-extern bool is_std_named_call_p (tree fndecl, const char *funcname,
+extern bool is_std_named_call_p (const_tree fndecl, const char *funcname);
+extern bool is_std_named_call_p (const_tree fndecl, const char *funcname,
 const gcall *call, unsigned int num_args);
 extern bool is_setjmp_call_p (const gcall *call);
 extern bool is_longjmp_call_p (const gcall *call);
diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index 1d69d57df0e..4f07d1f9257 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -1526,6 +1526,38 @@ malloc_state_machine::get_or_create_deallocator (tree deallocator_fndecl)
   return d;
 }
 
+/* Try to identify the function declaration either by name or as a known malloc
+   builtin.  */
+
+static bool
+known_allocator_p (const_tree fndecl, const gcall *call)
+{
+  /* Either it is a function we know by name and number of arguments... */
+  if (is_named_call_p (fndecl, "malloc", call, 1)
+  || is_named_call_p (fndecl, "calloc", call, 2)
+  || is_std_named_call_p (fndecl, "malloc", call, 1)
+  || is_std_named_call_p (fndecl, "calloc", call, 2)
+  || is_named_call_p (fndecl, "strdup", call, 1)
+  || is_named_call_p (fndecl, "strndup", call, 2))
+return true;
+
+  /* ... or it is a builtin allocator that allocates objects freed with
+ __builtin_free.  */
+  if (fndecl_built_in_p (fndecl))
+switch (DECL_FUNCTION_CODE (fndecl))
+  {
+  case BUILT_IN_MALLOC:
+  case BUILT_IN_CALLOC:
+  case BUILT_IN_STRDUP:
+  case BUILT_IN_STRNDUP:
+   return true;
+  default:
+   break;
+  }
+
+  return false;
+}
+
 /* Implementation of state_machine::on_stmt vfunc for malloc_state_machine.  */
 
 bool
@@ -1536,14 +1568,7 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
   if (const gcall *call = dyn_cast  (stmt))
 if (tree callee_fndecl = sm_ctxt->get_fndecl_for_call (call))
   {
-   if (is_named_call_p 

Re: [PATCH] match.pd: Fix up recent __builtin_bswap16 simplifications [PR101642]

2021-07-28 Thread Richard Biener
On Wed, 28 Jul 2021, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs.  The problem is that for __builtin_bswap16
> (and only that, others are fine) the argument of the builtin is promoted
> to int while the patterns assume it is not and is the same as that of
> the return type.
> For the bswap simplifications before these new ones it just means we
> fail to optimize stuff like __builtin_bswap16 (__builtin_bswap16 (x))
> because there are casts in between, but the last one, equality comparison
> of __builtin_bswap16 with integer constant results in ICE, because
> we create comparison with incompatible types of the operands, and the
> other might be fine because usually we bit and the operand before promoting,
> but I think it is too dangerous to rely on it, one day we find out that
> because it is operand to such a built in, we can throw away any changes
> that affect the upper bits and all of a sudden it would misbehave.
> 
> So, this patch introduces converts that shouldn't do anything for
> bswap{32,64,128} and should fix these issues for bswap16.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2021-07-28  Jakub Jelinek  
> 
>   PR middle-end/101642
>   * match.pd (bswap16 (x) == bswap16 (y)): Cast both operands
>   to type of bswap16 for comparison.
>   (bswap16 (x) == cst): Cast bswap16 operand to type of cst.
> 
>   * gcc.c-torture/compile/pr101642.c: New test.
> 
> --- gcc/match.pd.jj   2021-07-27 09:47:44.0 +0200
> +++ gcc/match.pd  2021-07-27 16:16:19.867757153 +0200
> @@ -3641,11 +3641,13 @@ (define_operator_list COND_TERNARY
> (bitop @0 (bswap @1
>   (for cmp (eq ne)
>(simplify
> -   (cmp (bswap @0) (bswap @1))
> -   (cmp @0 @1))
> +   (cmp (bswap@2 @0) (bswap @1))
> +   (with { tree ctype = TREE_TYPE (@2); }
> +(cmp (convert:ctype @0) (convert:ctype @1
>(simplify
> (cmp (bswap @0) INTEGER_CST@1)
> -   (cmp @0 (bswap @1
> +   (with { tree ctype = TREE_TYPE (@1); }
> +(cmp (convert:ctype @0) (bswap @1)
>   /* (bswap(x) >> C1) & C2 can sometimes be simplified to (x >> C3) & C2.  */
>   (simplify
>(bit_and (convert1? (rshift@0 (convert2? (bswap@4 @1)) INTEGER_CST@2))
> --- gcc/testsuite/gcc.c-torture/compile/pr101642.c.jj 2021-07-27 16:34:44.910039793 +0200
> +++ gcc/testsuite/gcc.c-torture/compile/pr101642.c 2021-07-27 16:32:24.661907592 +0200
> @@ -0,0 +1,17 @@
> +/* PR middle-end/101642 */
> +
> +int x;
> +
> +unsigned short
> +foo (void)
> +{
> +  return __builtin_bswap16 (x) ? : 0;
> +}
> +
> +int
> +bar (int x, int y)
> +{
> +  unsigned short a = __builtin_bswap16 ((unsigned short) x);
> +  unsigned short b = __builtin_bswap16 ((unsigned short) y);
> +  return a == b;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] c/101512 - fix missing address-taking in c_common_mark_addressable_vec

2021-07-28 Thread Richard Biener
On Wed, 21 Jul 2021, Jakub Jelinek wrote:

> On Wed, Jul 21, 2021 at 10:06:51AM +0200, Richard Biener wrote:
> > c_common_mark_addressable_vec fails to look through C_MAYBE_CONST_EXPR
> > in the case it isn't at the toplevel.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> > 
> > Thanks,
> > Richard.
> > 
> > 2021-07-21  Richard Biener  
> > 
> > PR c/101512
> > gcc/c-family/
> > * c-common.c (c_common_mark_addressable_vec): Look through
> > C_MAYBE_CONST_EXPR even if not at the toplevel.
> > 
> > * gcc.dg/torture/pr101512.c: New testcase.
> 
> I wonder if instead when trying to wrap
> C_MAYBE_CONST_EXPR into a VIEW_CONVERT_EXPR we shouldn't be
> removing that C_MAYBE_CONST_EXPR and perhaps adding it around the
> VIEW_CONVERT_EXPR.  E.g. various routines in c/c-typeck.c like
> build_unary_op remember int_operands, remove_c_maybe_const_expr
> and at the end note_integer_operands.
> 
> If Joseph thinks it is ok to have C_MAYBE_CONST_EXPR inside of
> VCE, then the patch looks good to me.

Joseph - any comments?  The above mentioned issue is pre-existent,
EXPR_INT_CONST_OPERANDS should be false on vector types though,
so eventually the C_MAYBE_CONST_EXPR is entirely pointless,
but it's unconditionally built in build_compound_literal but
C_MAYBE_CONST_EXPR_NON_CONST is set there.  The VIEW_CONVERT_EXPR
is built via convert(), the types have the same main variant,
just one is not named.  But nothing in convert() seems to care
about C_MAYBE_CONST_EXPR.

Thanks,
Richard.


Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

2021-07-28 Thread Richard Biener via Gcc-patches
On Tue, Jun 29, 2021 at 1:10 AM Victor Tong  wrote:
>
> Thanks Richard and Marc.
>
> I wrote the following test case to compare the outputs of fn1() and 
> fn1NoOpt() below with my extra pattern being applied. I tested the two 
> functions with all of the integers from INT_MIN to INT_MAX.
>
> long
> fn1 (int x)
> {
>   return 42L - (long)(42 - x);
> }
>
> #pragma GCC push_options
> #pragma GCC optimize ("O0")
> long
> fn1NoOpt (int x)
> {
>   volatile int y = (42 - x);
>   return 42L - (long)y;
> }
> #pragma GCC pop_options
>
> int main ()
> {
> for (long i=INT_MIN; i<=INT_MAX;i++)
> {
> auto valNoOpt = fn1NoOpt(i);
> auto valOpt = fn1(i);
> if (valNoOpt != valOpt)
> printf("valOpt=%ld, valNoOpt=%ld\n", valOpt, 
> valNoOpt);
> }
> return 0;
> }
>
> I saw that the return values of fn1() and fn1NoOpt() differed when the input 
> was between INT_MIN and INT_MIN+42 inclusive. When passing values in this 
> range to fn1NoOpt(), a signed overflow is triggered which causes the value to 
> differ (undefined behavior). This seems to go in line with what Marc 
> described and I think the transformation is correct in the scenario above. I 
> do think that type casts that result in truncation (i.e. from a higher 
> precision to a lower one) or with unsigned types will result in an incorrect 
> transformation so those scenarios need to be avoided.
>
> Given that the extra pattern I'm adding is taking advantage of the undefined
> behavior of signed integer overflow, I'm considering keeping the existing 
> nop_convert pattern in place and adding a new pattern to cover these new 
> cases. I'd also like to avoid touching nop_convert given that it's used in a 
> number of other patterns.
>
> This is the pattern I have currently:
>
>   (simplify
> (minus (convert1? @0) (convert2? (minus (convert3? @2) @1)))
> (if (operand_equal_p(@0, @2, 0)

The operand_equal_p should be reflected by using @0 in place of @2.

> && INTEGRAL_TYPE_P (type)
> && TYPE_OVERFLOW_UNDEFINED(type)
> && !TYPE_OVERFLOW_SANITIZED(type)
> && INTEGRAL_TYPE_P (TREE_TYPE(@1))
> && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@1))
> && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@1))
> && !TYPE_UNSIGNED (TREE_TYPE (@1))
> && !TYPE_UNSIGNED (type)

Please group checks on a common argument.  I think a single test
on INTEGRAL_TYPE_P (type) is enough, it could be ANY_INTEGRAL_TYPE_P
to include vector and complex types.

> && TYPE_PRECISION (TREE_TYPE (@1)) <= TYPE_PRECISION (type)
> && INTEGRAL_TYPE_P (TREE_TYPE(@0))
> && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))

why is this testing TREE_TYPE (@0)?  The outer subtract is using 'type',
the inner subtract uses TREE_TYPE (@1) though you could place
a capture on the minus to make what you test more obvious, like

  (minus (convert1? @0) (convert2? (minus@3 (convert3? @2) @1)))

TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3))

There's one set of checks too many, I think.

> && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))
> && !TYPE_UNSIGNED (TREE_TYPE (@0))

we only ever have TYPE_OVERFLOW_UNDEFINED on signed types so
the !TYPE_UNSIGNED checks are superfluous.

> && TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
> && TREE_TYPE(@1) == TREE_TYPE(@2))

This check means that convert3 is never present since a MINUS requires
compatible types.

Sorry for the late reply.

Note the pattern will necessarily be partly redundant with the

  (simplify
   (minus (nop_convert1? (minus (nop_convert2? @0) @1)) @0)
   (if (!ANY_INTEGRAL_TYPE_P (type)
|| TYPE_OVERFLOW_WRAPS (type))
   (negate (view_convert @1))
   (view_convert (negate @1

pattern.  Once we'd "inline" nop_convert genmatch would complain
about this.

> (convert @1)))
>
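As an illustration only (not part of the patch), the source-level identity
behind the quoted pattern can be checked with a small C sketch.  It assumes
the inner subtraction does not overflow, which for the constant 42 means
x >= INT_MIN + 43; the function names are made up for this example.

```c
#include <limits.h>

/* The expression from the quoted test case.  Only valid when 42 - x
   does not overflow, i.e. for x >= INT_MIN + 43.  */
long
minus_of_minus (int x)
{
  return 42L - (long) (42 - x);
}

/* What the pattern is meant to produce: a plain conversion of x.  */
long
plain_convert (int x)
{
  return (long) x;
}

/* Count sampled inputs on which the two forms disagree.  All samples
   stay inside the range where no signed overflow occurs.  */
unsigned
minus_identity_mismatches (void)
{
  static const int samples[] = {
    INT_MIN + 43, -1000, -1, 0, 1, 41, 42, 43, 1000, INT_MAX
  };
  unsigned bad = 0;
  for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++)
    if (minus_of_minus (samples[k]) != plain_convert (samples[k]))
      bad++;
  return bad;
}
```

Inputs below INT_MIN + 43 are deliberately left out: there the inner
subtraction overflows, so the two forms are allowed to differ.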
> Is there a more concise/better way of writing the pattern? I was looking for 
> similar checks in match.pd and I couldn't find anything that I could leverage.
>
> I also kept my pattern to the specific scenario I'm seeing with the 
> regression to lower the risk of something breaking. I've limited @1 and @2 to 
> have the same type.
>
> I'm also in favor of adding/running computer verification to make sure the 
> transformation is legal. I've written some tests to verify that the pattern 
> is being applied in the right scenarios and not being applied in others, but 
> I think there are too many possibilities to manually write them all. Is there 
> anything in GCC that can be used to verify that match.pd transformations are 
> correct? I'm thinking of something like Alive 
> https://github.com/AliveToolkit/alive2.
>
> Thanks,
> Victor
>
>
>
> From: Richard Biener 
> Sent: Monday, June 21, 2021 12:08 AM
> To: Marc Glisse 
> Cc: Victor Tong ; gcc-patches@gcc.gnu.org 
> 
> Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division 
> followed by multiply [PR95176]
>
> On Sat, 

Re: [PATCH] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

2021-07-28 Thread Richard Biener via Gcc-patches
On Sun, Jul 18, 2021 at 9:25 PM Di Zhao OS
 wrote:
>
>
> I tried to improve the patch following your advice and to catch more
> opportunities. Hope it'll be helpful.

Sorry for the late reply.

> On 6/24/21 8:29 AM, Richard Biener wrote:
> > On Thu, Jun 24, 2021 at 11:55 AM Di Zhao via Gcc-patches  > patc...@gcc.gnu.org> wrote:
> >
> > I have some reservations about extending the ad-hoc "predicated value" code.
> >
> > Some comments on the patch:
> >
> > +/* hashtable & helpers to record equivalences at given bb.  */
> > +
> > +typedef struct val_equiv_s
> > +{
> > +  val_equiv_s *next;
> > +  val_equiv_s *unwind_to;
> > +  hashval_t hashcode;
> > +  /* SSA name this val_equiv_s is associated with.  */
> > +  tree name;
> > +  /* RESULT in a vn_pval entry is SSA name of a equivalence.  */
> > +  vn_pval *values;
> > +} * val_equiv_t;
> >
> > all of this (and using a hashtable for recording) is IMHO a bit overkill.
> > Since you only ever record equivalences for values the more natural place to
> > hook those in is the vn_ssa_aux structure where we also record the
> > availability chain.
>
> I tried to store the equivalences in the vn_ssa_aux structure, but I didn't
> optimize the second case successfully: I need to record the equivalence
> of a PHI expression's result and arguments, but their value number results
> will become VARYING first, so they won't be changed. Maybe I'm missing
> something, or can I force change a VARYING result?

But VARYING still has a value-number - it's the result itself?

> Besides, the way currently used, equivalences only need to be "predictable"
> rather than available, maybe availability chains do not represent them very
> well?

Sure, they are a different beast - I'm only commenting on the place you store
them as being not too efficient.

> > There's little commentary in the new code, in particular function-level
> > comments are missing everywhere.
>
> Added more comments.
>
> > There's complexity issues, like I see val_equiv_insert has a "recurse"
> > feature but also find_predicated_value_by_equiv is quadratic in the number
> > of equivalences of the lhs/rhs.  Without knowing what the recursion on the
> > former is for - nothing tells me - I suspect one of the two should be
> > redundant.
>
> The intention was, given {A==B, B==C, X==Y, Y==Z} and a previous result of
> "C opcode Z", to find the result of "A opcode Y". I removed the "recurse"
> feature and modified the searching logic to solve the issue. Now a temporary
> hash_set is used to record the equivalences that are visited when searching.

OK, so you're covering transitivity at query time - that looks
expensive to me.  As
said I wonder if there's a more efficient way to store equivalences here.

> > You seem to record equivalences at possible use points which looks odd at
> > best - I'd expected equivalences being recorded at the same point we record
> > predicated values and for the current condition, not the one determining
> > some other predication.
> > What was the motivation to do it the way you do it?
>
> The purpose is to "bring down" what can be known from a previous basic-block
> that effectively dominates current block, but not actually does so (in the
> example it is because jump threading is hindered by a loop). For example in
> this case:
>
>   if (a != 0)
>   // Nothing useful can be recorded here, because this BB doesn't dominate
>   // the BB that we want to simplify.
>   c = b;
>   for (unsigned i = 0; i < c; i++)
> {
>   if (a != 0)  // The recording is triggered here.
>{
>      // c == b will be recorded here, so it can be used for simplification.
>  // In gimple it is the equivalence of a PHI's result and argument.
>  if (i >= b)
>foo ();
>
> This requires finding a previous condition that is identical to the current
> one, so it is convenient to do this in FRE. Besides, since FRE records derived
> predicates, there might also be opportunities for optimization with related
> conditions. This is included in the new patch.

I still don't quite understand why you cannot record the c = b equivalence
when processing its block.  You'd record "c == b if a != 0" and later
you look for equivalences on b and see if they are valid at the use site?
That's how predicated values work.
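For readers following along, here is a minimal, self-contained C sketch of
the quoted loop, completed so that it compiles.  It is an assumption-laden
reconstruction: the function name walk, the extra parameter c, and the
foo_calls counter are invented for illustration and do not come from the
patch or the PR testcase.

```c
/* Counts how often foo is reached; used only to observe behaviour.  */
unsigned foo_calls;

void
foo (void)
{
  foo_calls++;
}

/* Shape of the quoted example.  When a != 0, c is a copy of b, and inside
   the loop i < c holds, so "i >= b" can never be true under "a != 0".  */
void
walk (unsigned a, unsigned b, unsigned c)
{
  if (a != 0)
    c = b;
  for (unsigned i = 0; i < c; i++)
    {
      if (a != 0)
        {
          /* Predicated equivalence: c == b whenever a != 0.  */
          if (i >= b)
            foo ();
        }
    }
}
```

Whatever arguments are passed, foo is never called, which is the fact the
equivalence "c == b if a != 0" would let value numbering prove.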

I'd like to see this equivalence stuff more naturally integrated with the
value lattice, not a collection of bolted-on hashmaps.

> Besides, to find more opportunities, added a hashmap to store mappings from
> immediate dominators to basic-blocks with PHIs of interest.
>
> > Why is the code conditional on 'iterate'?
>
> I haven't worked it out to fit the non-iterate mode, so it now breaks the
> if_conversion pass. I think this is because some of the equivalence-recordings
> are too optimistic for non-iterate mode.

Huh.  The non-iterative mode should be easier to deal with, in fact if you
are running into correctness issues this 

Re: [PATCH] add access warning pass

2021-07-28 Thread Richard Biener via Gcc-patches
On Fri, Jul 16, 2021 at 12:42 AM Martin Sebor via Gcc-patches
 wrote:
>
> A number of access warnings as well as their supporting
> infrastructure (compute_objsize et al.) are implemented in
> builtins.{c,h} where they (mostly) operate on trees and run
> just before RTL expansion.
>
> This setup may have made sense initially when the warnings were
> very simple and didn't perform any CFG analysis, but it's becoming
> a liability.  The code has grown both in size and in complexity,
> might need to examine the CFG to improve detection, and in some
> cases might achieve a better S/R ratio if run earlier.  Running
> the warning code on trees is also slower because it doesn't
> benefit from the SSA_NAME caching provided by the pointer_query
> class.  Finally, having the code there is also an impediment to
> maintainability as warnings and builtin expansion are unrelated
> to each other and contributors to one area shouldn't need to wade
> through unrelated code (similar for patch reviewers).
>
> The attached change introduces a new warning pass and a couple of
> new source and headers and, as the first step, moves the warning
> code from builtins.{c,h} there.  To keep the initial changes as
> simple as possible the pass only runs a subset of existing
> warnings: -Wfree-nonheap-object, -Wmismatched-dealloc, and
> -Wmismatched-new-delete.  The others (-Wstringop-overflow and
> -Wstringop-overread) still run on the tree representation and
> are still invoked from builtins.c or elsewhere.
>
> The changes have no functional impact either on codegen or on
> warnings.  I tested them on x86_64-linux.
>
> As the next step I plan to change the -Wstringop-overflow and
> -Wstringop-overread code to run on the GIMPLE IL in the new pass
> instead of on trees in builtins.c.

That's the maybe_warn_rdwr_sizes thing?

+  gimple *stmt = gsi_stmt (si);
+  if (!is_gimple_call (stmt))
+   continue;
+
+  check (as_a <gcall *> (stmt));


 if (gcall *call = dyn_cast <gcall *> (gsi_stmt (si)))
   check (call);

might be more C++-ish.

The patch looks OK - I skimmed it as mostly moving things
around plus adding a new pass.

Thanks,
Richard.

> Martin
>
> PS The builtins.c diff produced by git diff was much bigger than
> the changes justify.  It seems that the code removal somehow
> confused it.  To make review easy I replaced it with a plain
> unified diff of builtins.c that doesn't suffer from the problem.


Re: [PATCH] wwwdocs: Clarify meaning of "not issued by" in bugs web page

2021-07-28 Thread Jonathan Wakely via Gcc-patches
On Tue, 27 Jul 2021 at 18:30, Martin Sebor  wrote:
>
> On 7/27/21 9:16 AM, Jonathan Wakely via Gcc-patches wrote:
> > Should we make this change?
> >
> > Firstly, these bullet points are full sentences and so should end with
> > a period (or smiley, in some cases).
>
> I'd expect that to be relatively uncontroversial ;)
>
> >
> > Secondly, releases are not issued by the GNU Project at all, they're
> > issued by the GCC release managers.
>
> I (and I suspect most users unfamiliar with the inner workings of
> the project) think of release managers as acting on behalf of
> the whole project, so even though they technically cut the release
> it's still put out by the project as a whole.

Which is why I didn't mention release managers in the actual patch. I
changed "GNU Project" to "GCC project", because the GNU Project does
not do our releases, we do (via the RMs).



>
> >
> > Finally, "releases or snapshots of GCC not issued by ..." has confused
> > at least one bug reporter, and I think saying "unofficial releases or
> > snapshots" makes it slightly clearer. Comparatively few users actually
> > use a self-built GCC based on official source tarballs, but that's OK.
> > Distro builds tend to be much closer to upstream these days, and we
> > rarely reject bug reports where the reporter is using a build from
> > Fedora, Ubuntu, Arch or whatever (unless it really is caused by a
> > downstream patch and doesn't reproduce with a gcc.gnu.org release).
> >
> > OK for wwwdocs?
>
> It makes sense to me.  I'd also correct the grammar in "report them
> to whoever" either by changing it "report them to whomever" or by
> rephrasing it (e.g., "report them to the provider of the release").
>
> Martin
>



[PATCH] ubsan: Fix ICEs with DECL_REGISTER tests [PR101624]

2021-07-28 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because the base is a CONST_DECL for
the Fortran parameter, and ubsan/sanopt uses DECL_REGISTER macro on it.
/* In VAR_DECL and PARM_DECL nodes, nonzero means declared `register'.  */
#define DECL_REGISTER(NODE) (DECL_WRTL_CHECK (NODE)->decl_common.decl_flag_0)
while CONST_DECL doesn't satisfy DECL_WRTL_CHECK.

The following patch checks explicitly for VAR_DECL/PARM_DECL/RESULT_DECL
only before using DECL_REGISTER, and assumes other decls aren't DECL_REGISTER.
Not really sure about RESULT_DECL but it at least satisfies DECL_WRTL_CHECK...
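The shape of the guard can be modelled outside of GCC with a toy C sketch.
Everything below is hypothetical: the toy_ names are stand-ins invented for
this example and are not GCC's tree codes or macros.  It only shows that
the ubsan.c form of the condition and the sanopt.c form are exact
complements, and that a constant decl is never treated as a register decl.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the decl kinds; not GCC definitions.  */
enum toy_code { TOY_VAR_DECL, TOY_PARM_DECL, TOY_RESULT_DECL, TOY_CONST_DECL };

struct toy_decl
{
  enum toy_code code;
  bool register_flag;   /* Only meaningful for VAR/PARM/RESULT decls.  */
};

/* True for the decl kinds on which the register flag may be read.  */
bool
toy_has_register_flag (const struct toy_decl *d)
{
  return (d->code == TOY_VAR_DECL
          || d->code == TOY_PARM_DECL
          || d->code == TOY_RESULT_DECL);
}

/* Shape of the ubsan.c condition: bail out for register decls.  */
bool
toy_skip_instrumentation (const struct toy_decl *d)
{
  return toy_has_register_flag (d) && d->register_flag;
}

/* Shape of the sanopt.c condition: proceed for anything else.  */
bool
toy_can_optimize (const struct toy_decl *d)
{
  return !toy_has_register_flag (d) || !d->register_flag;
}
```

In both forms the flag is consulted only after the kind check has
succeeded, which mirrors why the patched code no longer trips the checking
macro on a CONST_DECL.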

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
backports?

2021-07-28  Jakub Jelinek  

PR middle-end/101624
* ubsan.c (maybe_instrument_pointer_overflow,
instrument_object_size): Only test DECL_REGISTER on VAR_DECLs,
PARM_DECLs or RESULT_DECLs.
* sanopt.c (maybe_optimize_ubsan_ptr_ifn): Likewise.

* gfortran.dg/ubsan/ubsan.exp: New file.
* gfortran.dg/ubsan/pr101624.f90: New test.

--- gcc/ubsan.c.jj  2021-05-10 12:22:30.425451947 +0200
+++ gcc/ubsan.c 2021-07-27 19:18:05.926969704 +0200
@@ -1443,7 +1443,10 @@ maybe_instrument_pointer_overflow (gimpl
   tree base;
   if (decl_p)
 {
-  if (DECL_REGISTER (inner))
+  if ((VAR_P (inner)
+  || TREE_CODE (inner) == PARM_DECL
+  || TREE_CODE (inner) == RESULT_DECL)
+ && DECL_REGISTER (inner))
return;
   base = inner;
   /* If BASE is a fixed size automatic variable or
@@ -2115,7 +2118,10 @@ instrument_object_size (gimple_stmt_iter
   tree base;
   if (decl_p)
 {
-  if (DECL_REGISTER (inner))
+  if ((VAR_P (inner)
+  || TREE_CODE (inner) == PARM_DECL
+  || TREE_CODE (inner) == RESULT_DECL)
+ && DECL_REGISTER (inner))
return;
   base = inner;
 }
--- gcc/sanopt.c.jj 2021-06-14 12:27:18.605410685 +0200
+++ gcc/sanopt.c 2021-07-27 19:16:45.667035649 +0200
@@ -492,7 +492,10 @@ maybe_optimize_ubsan_ptr_ifn (sanopt_ctx
  , , );
   if ((offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST)
  && DECL_P (base)
- && !DECL_REGISTER (base)
+ && ((!VAR_P (base)
+  && TREE_CODE (base) != PARM_DECL
+  && TREE_CODE (base) != RESULT_DECL)
+ || !DECL_REGISTER (base))
  && pbitpos.is_constant ())
{
  offset_int expr_offset;
--- gcc/testsuite/gfortran.dg/ubsan/ubsan.exp.jj 2021-07-27 19:59:24.889038766 +0200
+++ gcc/testsuite/gfortran.dg/ubsan/ubsan.exp 2021-07-27 20:00:18.538326168 +0200
@@ -0,0 +1,38 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite for gfortran that checks undefined behavior sanitizer.
+
+# Load support procs.
+load_lib gfortran-dg.exp
+load_lib ubsan-dg.exp
+
+
+# Initialize `dg'.
+dg-init
+ubsan_init
+
+# Main loop.
+if [check_effective_target_fsanitize_undefined] {
+gfortran-dg-runtest [lsort \
+   [glob -nocomplain $srcdir/$subdir/*.\[fF\]{,90,95,03,08} ] ] "" ""
+}
+
+# All done.
+ubsan_finish
+dg-finish
--- gcc/testsuite/gfortran.dg/ubsan/pr101624.f90.jj 2021-07-27 19:56:51.831071747 +0200
+++ gcc/testsuite/gfortran.dg/ubsan/pr101624.f90 2021-07-27 19:59:14.634174975 +0200
@@ -0,0 +1,13 @@
+! PR middle-end/101624
+! { dg-do compile }
+! { dg-options "-O2 -fsanitize=undefined" }
+
+complex function foo (x)
+  complex, intent(in) :: x
+  foo = aimag (x)
+end
+program pr101624
+  complex, parameter :: a = (0.0, 1.0)
+  complex :: b, foo
+  b = foo (a)
+end

Jakub



[PATCH] match.pd: Fix up recent __builtin_bswap16 simplifications [PR101642]

2021-07-28 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs.  The problem is that for __builtin_bswap16
(and only that, others are fine) the argument of the builtin is promoted
to int while the patterns assume it is not and is the same as that of
the return type.
For the bswap simplifications before these new ones it just means we
fail to optimize stuff like __builtin_bswap16 (__builtin_bswap16 (x))
because there are casts in between, but the last one, equality comparison
of __builtin_bswap16 with integer constant results in ICE, because
we create comparison with incompatible types of the operands, and the
other might be fine because usually we bit and the operand before promoting,
but I think it is too dangerous to rely on it, one day we find out that
because it is operand to such a built in, we can throw away any changes
that affect the upper bits and all of a sudden it would misbehave.

So, this patch introduces converts that shouldn't do anything for
bswap{32,64,128} and should fix these issues for bswap16.
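As a rough source-level illustration (not the match.pd code itself), the
two identities the patterns rely on can be checked in plain C once the
operands are narrowed to 16 bits.  The function names are invented for this
sketch, and __builtin_bswap16 is assumed to be available as in GCC.

```c
#include <stdint.h>

/* bswap16 (x) == bswap16 (y)  <=>  x == y, for 16-bit x and y.
   Walks every x and a sampled subset of y; returns the number of
   disagreements.  */
unsigned long
bswap16_eq_mismatches (void)
{
  unsigned long bad = 0;
  for (uint32_t x = 0; x <= 0xffff; x++)
    for (uint32_t y = 0; y <= 0xffff; y += 257)
      {
        uint16_t a = __builtin_bswap16 ((uint16_t) x);
        uint16_t b = __builtin_bswap16 ((uint16_t) y);
        if ((a == b) != ((uint16_t) x == (uint16_t) y))
          bad++;
      }
  return bad;
}

/* bswap16 (x) == cst  <=>  x == bswap16 (cst), for 16-bit x.
   Returns the number of x values on which the two sides disagree.  */
unsigned long
bswap16_cst_mismatches (uint16_t cst)
{
  unsigned long bad = 0;
  uint16_t swapped_cst = __builtin_bswap16 (cst);
  for (uint32_t x = 0; x <= 0xffff; x++)
    if ((__builtin_bswap16 ((uint16_t) x) == cst)
        != ((uint16_t) x == swapped_cst))
      bad++;
  return bad;
}
```

Both identities hold only because every value involved is 16 bits wide.
The ICE came from comparing the promoted int argument directly against a
16-bit operand, which is what the added converts avoid.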

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-07-28  Jakub Jelinek  

PR middle-end/101642
* match.pd (bswap16 (x) == bswap16 (y)): Cast both operands
to type of bswap16 for comparison.
(bswap16 (x) == cst): Cast bswap16 operand to type of cst.

* gcc.c-torture/compile/pr101642.c: New test.

--- gcc/match.pd.jj 2021-07-27 09:47:44.0 +0200
+++ gcc/match.pd 2021-07-27 16:16:19.867757153 +0200
@@ -3641,11 +3641,13 @@ (define_operator_list COND_TERNARY
(bitop @0 (bswap @1
  (for cmp (eq ne)
   (simplify
-   (cmp (bswap @0) (bswap @1))
-   (cmp @0 @1))
+   (cmp (bswap@2 @0) (bswap @1))
+   (with { tree ctype = TREE_TYPE (@2); }
+(cmp (convert:ctype @0) (convert:ctype @1
   (simplify
(cmp (bswap @0) INTEGER_CST@1)
-   (cmp @0 (bswap @1
+   (with { tree ctype = TREE_TYPE (@1); }
+(cmp (convert:ctype @0) (bswap @1)
  /* (bswap(x) >> C1) & C2 can sometimes be simplified to (x >> C3) & C2.  */
  (simplify
   (bit_and (convert1? (rshift@0 (convert2? (bswap@4 @1)) INTEGER_CST@2))
--- gcc/testsuite/gcc.c-torture/compile/pr101642.c.jj 2021-07-27 16:34:44.910039793 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr101642.c 2021-07-27 16:32:24.661907592 +0200
@@ -0,0 +1,17 @@
+/* PR middle-end/101642 */
+
+int x;
+
+unsigned short
+foo (void)
+{
+  return __builtin_bswap16 (x) ? : 0;
+}
+
+int
+bar (int x, int y)
+{
+  unsigned short a = __builtin_bswap16 ((unsigned short) x);
+  unsigned short b = __builtin_bswap16 ((unsigned short) y);
+  return a == b;
+}

Jakub



[PATCH] i386: Improve extensions of __builtin_clz and constant - __builtin_clz for -mno-lzcnt [PR78103]

2021-07-28 Thread Jakub Jelinek via Gcc-patches
Hi!

This patch improves emitted code for the non-TARGET_LZCNT case.
As __builtin_clz* is UB on 0 argument and for !TARGET_LZCNT
CLZ_VALUE_DEFINED_AT_ZERO is 0, it is UB even at RTL time and so we
can take advantage of that and assume the result will be 0 to 31 or
0 to 63.
Given that, sign or zero extension of that result are the same and
are actually already performed by bsrl or xorl instructions.
And constant - __builtin_clz* can be simplified into
bsr + constant - bitmask.
For TARGET_LZCNT, a lot of this is already fine as is (e.g. the sign or
zero extensions), and other optimizations are IMHO not possible
(if we have lzcnt, we've lost information on whether it is UB at
zero or not and so can't transform it into bsr even when that is
1-2 insns shorter).
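The arithmetic behind these simplifications can be sketched in C for the
32-bit case.  This is only an illustration under the assumption of a 32-bit
unsigned int and nonzero inputs; msb_index and clz_identity_mismatches are
names made up for the example, with msb_index standing in for the value the
bsr instruction produces.

```c
/* Index of the most significant set bit of nonzero X, computed without
   any builtin; this models the result of bsr.  */
unsigned
msb_index (unsigned x)
{
  unsigned i = 0;
  while (x >>= 1)
    i++;
  return i;
}

/* Count sampled nonzero inputs on which any identity fails.
   Assumes a 32-bit unsigned int.  */
unsigned
clz_identity_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned shift = 0; shift < 32; shift++)
    {
      unsigned top = 1u << shift;
      unsigned samples[3] = { top, top | 1u, top | (top - 1u) };
      for (int k = 0; k < 3; k++)
        {
          unsigned x = samples[k];
          unsigned clz = (unsigned) __builtin_clz (x);
          unsigned bsr = msb_index (x);
          if ((clz ^ 31u) != bsr          /* clz is bsr followed by xor 31.  */
              || 31u - clz != bsr         /* 31 - clz needs only the bsr.  */
              || 32u - clz != bsr + 1u)   /* 32 - clz is bsr plus one.  */
            bad++;
        }
    }
  return bad;
}
```

Since clz stays in the range 0 to 31 for nonzero input, subtracting it from
31 and xoring it with 31 give the same value, which is why the xor and the
subtraction can cancel and leave a bare bsr, or a bsr followed by a single
lea when the constant is 32.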
The changes on the 3 testcases between unpatched and patched gcc
are for -m64:
pr78103-1.s:
    bsrq    %rdi, %rax
-   xorq    $63, %rax
-   cltq
+   xorl    $63, %eax
...
    bsrq    %rdi, %rax
-   xorq    $63, %rax
-   cltq
+   xorl    $63, %eax
...
    bsrl    %edi, %eax
    xorl    $31, %eax
-   cltq
...
    bsrl    %edi, %eax
    xorl    $31, %eax
-   cltq
pr78103-2.s:
    bsrl    %edi, %edi
-   movl    $32, %eax
-   xorl    $31, %edi
-   subl    %edi, %eax
+   leal    1(%rdi), %eax
...
-   bsrl    %edi, %edi
-   movl    $31, %eax
-   xorl    $31, %edi
-   subl    %edi, %eax
+   bsrl    %edi, %eax
...
    bsrq    %rdi, %rdi
-   movl    $64, %eax
-   xorq    $63, %rdi
-   subl    %edi, %eax
+   leal    1(%rdi), %eax
...
-   bsrq    %rdi, %rdi
-   movl    $63, %eax
-   xorq    $63, %rdi
-   subl    %edi, %eax
+   bsrq    %rdi, %rax
pr78103-3.s:
    bsrl    %edi, %edi
-   movl    $32, %eax
-   xorl    $31, %edi
-   movslq  %edi, %rdi
-   subq    %rdi, %rax
+   leaq    1(%rdi), %rax
...
-   bsrl    %edi, %edi
-   movl    $31, %eax
-   xorl    $31, %edi
-   movslq  %edi, %rdi
-   subq    %rdi, %rax
+   bsrl    %edi, %eax
...
    bsrq    %rdi, %rdi
-   movl    $64, %eax
-   xorq    $63, %rdi
-   movslq  %edi, %rdi
-   subq    %rdi, %rax
+   leaq    1(%rdi), %rax
...
-   bsrq    %rdi, %rdi
-   movl    $63, %eax
-   xorq    $63, %rdi
-   movslq  %edi, %rdi
-   subq    %rdi, %rax
+   bsrq    %rdi, %rax

Most of the changes are done with combine splitters, but for
*bsr_rex64_2 and *bsr_2 I had to use define_insn_and_split, because
as mentioned in the PR the combiner unfortunately doesn't create LOG_LINKS
in between the two insns created by combine splitter, so it can't be
combined further with following instructions.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-07-28  Jakub Jelinek  

PR target/78103
* config/i386/i386.md (*bsr_rex64_1, *bsr_1, *bsr_zext_1): New
define_insn patterns.
(*bsr_rex64_2, *bsr_2): New define_insn_and_split patterns.
Add combine splitters for constant - clz.
(clz<mode>2): Use a temporary pseudo for bsr result.

* gcc.target/i386/pr78103-1.c: New test.
* gcc.target/i386/pr78103-2.c: New test.
* gcc.target/i386/pr78103-3.c: New test.

--- gcc/config/i386/i386.md.jj  2021-07-27 09:47:30.311970004 +0200
+++ gcc/config/i386/i386.md 2021-07-27 15:37:59.011394624 +0200
@@ -14761,6 +14761,18 @@ (define_insn "bsr_rex64"
(set_attr "znver1_decode" "vector")
(set_attr "mode" "DI")])
 
+(define_insn "*bsr_rex64_1"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (minus:DI (const_int 63)
+ (clz:DI (match_operand:DI 1 "nonimmediate_operand" "rm"
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_LZCNT && TARGET_64BIT"
+  "bsr{q}\t{%1, %0|%0, %1}"
+  [(set_attr "type" "alu1")
+   (set_attr "prefix_0f" "1")
+   (set_attr "znver1_decode" "vector")
+   (set_attr "mode" "DI")])
+
 (define_insn "bsr"
   [(set (reg:CCZ FLAGS_REG)
(compare:CCZ (match_operand:SI 1 "nonimmediate_operand" "rm")
@@ -14775,17 +14787,210 @@ (define_insn "bsr"
(set_attr "znver1_decode" "vector")
(set_attr "mode" "SI")])
 
+(define_insn "*bsr_1"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (minus:SI (const_int 31)
+ (clz:SI (match_operand:SI 1 "nonimmediate_operand" "rm"
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_LZCNT"
+  "bsr{l}\t{%1, %0|%0, %1}"
+  [(set_attr "type" "alu1")
+   (set_attr "prefix_0f" "1")
+   (set_attr "znver1_decode" "vector")
+   (set_attr "mode" "SI")])
+
+(define_insn "*bsr_zext_1"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (minus:SI
+   (const_int 31)
+   (clz:SI (match_operand:SI 1 "nonimmediate_operand" "rm")
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_LZCNT && TARGET_64BIT"
+  "bsr{l}\t{%1, %k0|%k0, %1}"
+  [(set_attr "type" "alu1")
+   (set_attr "prefix_0f" "1")
+   

[PATCH] [i386] Add a separate function to calculate cost for WIDEN_MULT_EXPR.

2021-07-28 Thread liuhongt via Gcc-patches
Hi:
  As described in PR 39821, WIDEN_MULT_EXPR should use a different cost
model from MULT_EXPR; this patch adds ix86_widen_mult_cost for that.
Reference basis for the cost model is https://godbolt.org/z/EMjaz4Knn.

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.

gcc/ChangeLog:

* config/i386/i386.c (ix86_widen_mult_cost): New function.
(ix86_add_stmt_cost): Use ix86_widen_mult_cost for
WIDEN_MULT_EXPR.

gcc/testsuite/ChangeLog:

* gcc.target/i386/sse2-pr39821.c: New test.
* gcc.target/i386/sse4-pr39821.c: New test.
---
 gcc/config/i386/i386.c   | 48 +++-
 gcc/testsuite/gcc.target/i386/sse2-pr39821.c | 45 ++
 gcc/testsuite/gcc.target/i386/sse4-pr39821.c |  4 ++
 3 files changed, 96 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-pr39821.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse4-pr39821.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 876a19f4c1f..281b5fe2706 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19757,6 +19757,44 @@ ix86_vec_cost (machine_mode mode, int cost)
   return cost;
 }
 
+/* Return cost of vec_widen_mult_hi/lo_,
+   vec_widen_mul_hi/lo_ is only available for VI124_AVX2.  */
+static int
+ix86_widen_mult_cost (const struct processor_costs *cost,
+ enum machine_mode mode, bool uns_p)
+{
+  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
+  int extra_cost = 0;
+  int basic_cost = 0;
+  switch (mode)
+{
+case V8HImode:
+case V16HImode:
+  if (!uns_p || mode == V16HImode)
+   extra_cost = cost->sse_op * 2;
+  basic_cost = cost->mulss * 2 + cost->sse_op * 4;
+  break;
+case V4SImode:
+case V8SImode:
+  /* pmulhw/pmullw can be used.  */
+  basic_cost = cost->mulss * 2 + cost->sse_op * 2;
+  break;
+case V2DImode:
+  /* pmuludq under sse2, pmuldq under sse4.1, for sign_extend,
+require extra 4 mul, 4 add, 4 cmp and 2 shift.  */
+  if (!TARGET_SSE4_1 && !uns_p)
+   extra_cost = (cost->mulss + cost->addss + cost->sse_op) * 4
+ + cost->sse_op * 2;
+  /* Fallthru.  */
+case V4DImode:
+  basic_cost = cost->mulss * 2 + cost->sse_op * 4;
+  break;
+default:
+  gcc_unreachable();
+}
+  return ix86_vec_cost (mode, basic_cost + extra_cost);
+}
+
 /* Return cost of multiplication in MODE.  */
 
 static int
@@ -22483,10 +22521,18 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
  break;
 
case MULT_EXPR:
-   case WIDEN_MULT_EXPR:
+ /*For MULT_HIGHPART_EXPR, x86 only supports pmulhw,
+   take it as MULT_EXPR.  */
case MULT_HIGHPART_EXPR:
  stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
  break;
+ /* There's no direct instruction for WIDEN_MULT_EXPR,
+take emulation into account.  */
+   case WIDEN_MULT_EXPR:
+ stmt_cost = ix86_widen_mult_cost (ix86_cost, mode,
+   TYPE_UNSIGNED (vectype));
+ break;
+
case NEGATE_EXPR:
  if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
stmt_cost = ix86_cost->sse_op;
diff --git a/gcc/testsuite/gcc.target/i386/sse2-pr39821.c b/gcc/testsuite/gcc.target/i386/sse2-pr39821.c
new file mode 100644
index 000..bcd4b772c98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse2-pr39821.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-msse2 -mno-sse4.1 -O3 -fdump-tree-vect-details" } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */
+#include <stdint.h>
+void
+vec_widen_smul8 (int16_t* __restrict v3, int8_t *v1, int8_t *v2, int order)
+{
+  while (order--)
+*v3++ = (int16_t) *v1++ * *v2++;
+}
+
+void
+vec_widen_umul8(uint16_t* __restrict v3, uint8_t *v1, uint8_t *v2, int order)
+{
+  while (order--)
+*v3++ = (uint16_t) *v1++ * *v2++;
+}
+
+void
+vec_widen_smul16(int32_t* __restrict v3, int16_t *v1, int16_t *v2, int order)
+{
+  while (order--)
+*v3++ = (int32_t) *v1++ * *v2++;
+}
+
+void
+vec_widen_umul16(uint32_t* __restrict v3, uint16_t *v1, uint16_t *v2, int order)
+{
+  while (order--)
+*v3++ = (uint32_t) *v1++ * *v2++;
+}
+
+void
+vec_widen_smul32(int64_t* __restrict v3, int32_t *v1, int32_t *v2, int order)
+{
+  while (order--)
+*v3++ = (int64_t) *v1++ * *v2++;
+}
+
+void
+vec_widen_umul32(uint64_t* __restrict v3, uint32_t *v1, uint32_t *v2, int order)
+{
+  while (order--)
+*v3++ = (uint64_t) *v1++ * *v2++;
+}
diff --git a/gcc/testsuite/gcc.target/i386/sse4-pr39821.c b/gcc/testsuite/gcc.target/i386/sse4-pr39821.c
new file mode 100644
index 000..4456c31e43e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse4-pr39821.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-msse4.1 -O3 -fdump-tree-vect-details" } */
+/* { dg-final { scan-tree-dump-times 
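
For readers skimming the cost model in the i386.c hunk above, here is a minimal standalone sketch of the same arithmetic. It is not the GCC code: struct toy_costs, enum toy_mode, and the have_sse4_1 parameter are hypothetical stand-ins for processor_costs, machine_mode, and TARGET_SSE4_1, and the final ix86_vec_cost scaling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the processor_costs fields the patch reads.  */
struct toy_costs { int mulss; int addss; int sse_op; };

/* Hypothetical stand-in for the vector modes handled by the patch.  */
enum toy_mode { TOY_V8HI, TOY_V16HI, TOY_V4SI, TOY_V8SI, TOY_V2DI, TOY_V4DI };

/* Mirrors the switch in the patch: a basic cost per mode plus an extra
   cost for the cases that need additional emulation steps.  */
static int
toy_widen_mult_cost (const struct toy_costs *cost, enum toy_mode mode,
                     bool uns_p, bool have_sse4_1)
{
  assert (cost != NULL);
  int extra_cost = 0;
  int basic_cost = 0;
  switch (mode)
    {
    case TOY_V8HI:
    case TOY_V16HI:
      if (!uns_p || mode == TOY_V16HI)
        extra_cost = cost->sse_op * 2;
      basic_cost = cost->mulss * 2 + cost->sse_op * 4;
      break;
    case TOY_V4SI:
    case TOY_V8SI:
      basic_cost = cost->mulss * 2 + cost->sse_op * 2;
      break;
    case TOY_V2DI:
      /* Signed multiply without SSE4.1: 4 mul, 4 add, 4 cmp, 2 shift.  */
      if (!have_sse4_1 && !uns_p)
        extra_cost = (cost->mulss + cost->addss + cost->sse_op) * 4
                     + cost->sse_op * 2;
      /* Fall through.  */
    case TOY_V4DI:
      basic_cost = cost->mulss * 2 + cost->sse_op * 4;
      break;
    }
  return basic_cost + extra_cost;
}
```

With made-up costs of mulss = 4, addss = 3 and sse_op = 1, the signed V2DI case without SSE4.1 comes out well above the unsigned one, which is the asymmetry the patch wants the vectorizer to see.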

Re: [PATCH] Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-28 Thread Andreas Krebbel via Gcc-patches
On 7/28/21 9:43 AM, Richard Biener wrote:
> On Wed, Jul 28, 2021 at 8:44 AM Andreas Krebbel via Gcc-patches
>  wrote:
>>
>> There are also memory operands passed for in0 and in1.
>>
>> Ok for mainline?
> 
> They can also be constant vectors, I'd just not specify the operand
> kind - usually expanders are not limited as to what they feed down.

Right, I'll just replace "registers" with "operands" then. Ok?

 also to emit such a permutation.  In the former case @var{in0}, @var{in1}\n\
 and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are\n\
 the source vectors and @var{out} is the destination vector; all three are\n\
-registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
+operands of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
 @var{sel} describes a permutation on one vector instead of two.\n\
 \n\
 Return true if the operation is possible, emitting instructions for it\n\

Andreas
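
As a rough illustration of the semantics the hook text describes (a test-only query when the operands are null, otherwise a permutation of up to two source vectors), here is a toy model in plain C. It is only a sketch: toy_vec_perm_const is a hypothetical name, it works on int arrays rather than rtxes, and it does not reflect the real hook's signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: SEL indexes the concatenation of IN0 and IN1, each of N
   elements.  With OUT == NULL only the "is this possible" test is done.
   OUT must not alias IN0 or IN1.  */
static bool
toy_vec_perm_const (int *out, const int *in0, const int *in1,
                    const unsigned *sel, size_t n)
{
  /* Every selector index must address one of the 2 * n input elements.  */
  for (size_t i = 0; i < n; i++)
    if (sel[i] >= 2 * n)
      return false;

  /* Test-only query: nothing to emit.  */
  if (!out)
    return true;

  assert (in0 && in1);
  for (size_t i = 0; i < n; i++)
    out[i] = sel[i] < n ? in0[sel[i]] : in1[sel[i] - n];
  return true;
}
```

A single-vector permutation is expressed by passing the same array for in0 and in1, matching the "in1 is the same as in0" wording in the documentation.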


Re: [PATCH] Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-28 Thread Richard Biener via Gcc-patches
On Wed, Jul 28, 2021 at 8:44 AM Andreas Krebbel via Gcc-patches
 wrote:
>
> There are also memory operands passed for in0 and in1.
>
> Ok for mainline?

They can also be constant vectors, I'd just not specify the operand
kind - usually expanders are not limited as to what they feed down.

> gcc/ChangeLog:
>
> * target.def: Describe in0 and in1 as being either register or
> memory operands.
> * doc/tm.texi: Regenerate.
> ---
>  gcc/doc/tm.texi | 7 ---
>  gcc/target.def  | 7 ---
>  2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index c8f4abe3e41..31f188daf00 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6124,9 +6124,10 @@ This hook is used to test whether the target can permute up to two
>  vectors of mode @var{mode} using the permutation vector @code{sel}, and
>  also to emit such a permutation.  In the former case @var{in0}, @var{in1}
>  and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are
> -the source vectors and @var{out} is the destination vector; all three are
> -registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if
> -@var{sel} describes a permutation on one vector instead of two.
> +the source vectors and @var{out} is the destination vector.  The destination
> +vector is a register of mode @var{mode} while the source vectors can be either
> +register or memory operands of mode @var{mode}.  @var{in1} is the same as
> +@var{in0} if @var{sel} describes a permutation on one vector instead of two.
>
>  Return true if the operation is possible, emitting instructions for it
>  if rtxes are provided.
> diff --git a/gcc/target.def b/gcc/target.def
> index 2e40448e6c5..b368d81be63 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -1860,9 +1860,10 @@ DEFHOOK
>  vectors of mode @var{mode} using the permutation vector @code{sel}, and\n\
>  also to emit such a permutation.  In the former case @var{in0}, @var{in1}\n\
>  and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are\n\
> -the source vectors and @var{out} is the destination vector; all three are\n\
> -registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
> -@var{sel} describes a permutation on one vector instead of two.\n\
> +the source vectors and @var{out} is the destination vector.  The destination\n\
> +vector is a register of mode @var{mode} while the source vectors can be either\n\
> +register or memory operands of mode @var{mode}.  @var{in1} is the same as\n\
> +@var{in0} if @var{sel} describes a permutation on one vector instead of two.\n\
>  \n\
>  Return true if the operation is possible, emitting instructions for it\n\
>  if rtxes are provided.\n\
> --
> 2.31.1
>


GCC 11.2.1 Status Report (2021-07-28)

2021-07-28 Thread Richard Biener


Status
======

The GCC 11.2.0 tarballs have been generated and uploaded and the
GCC 11 branch is again open for regression and documentation fixes.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2              262   +   2
P3              104   +   8
P4              206
P5               24
--------        ---   -----------------------
Total P1-P3     366   +  10
Total           596   +  10


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2021-July/236836.html


retain debug stmt order when moving to successors

2021-07-28 Thread Alexandre Oliva


We iterate over debug stmts from the last one in new_bb, and we insert
them before the first post-label stmt in each dest block, without
moving the insertion iterator, so they end up reversed.  Moving the
insertion iterator fixes this.
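
To see why the iterator update matters, here is a toy model of the loop, assuming nothing about the real gimple_stmt_iterator API: statements are single characters in an array, the source is walked from last to first, and the only difference between the two runs is whether the insertion point stays on the old stmt (in the spirit of GSI_SAME_STMT) or moves to the newly inserted one (in the spirit of GSI_NEW_STMT). All names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_STMTS 16

/* Toy statement sequence: one character per stmt, kept NUL-terminated.  */
struct toy_seq { char stmts[TOY_MAX_STMTS]; int len; };

/* Insert STMT before index POS, shifting later entries to the right.  */
static void
toy_insert_before (struct toy_seq *s, int pos, char stmt)
{
  assert (pos >= 0 && pos <= s->len && s->len + 1 < TOY_MAX_STMTS);
  memmove (&s->stmts[pos + 1], &s->stmts[pos], (size_t) (s->len - pos));
  s->stmts[pos] = stmt;
  s->len++;
}

/* Walk SRC from its last stmt to its first, inserting each one before the
   insertion point in DEST, which starts at DEST's first stmt.  */
static void
toy_move_debug_stmts (const char *src, struct toy_seq *dest, int move_iterator)
{
  int pos = 0;
  for (int i = (int) strlen (src) - 1; i >= 0; i--)
    {
      toy_insert_before (dest, pos, src[i]);
      /* The new stmt now sits at POS and the old insertion point at
         POS + 1.  Staying on the old stmt means following it.  */
      if (!move_iterator)
        pos++;
    }
}
```

Moving "abc" in front of "X" yields "cbaX" when the insertion point stays put and "abcX" when it follows the new stmt, which is the ordering the patch restores.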

Regstrapped on x86_64-linux-gnu.  Ok to install?

for  gcc/ChangeLog

* tree-inline.c (maybe_move_debug_stmts_to_successors): Don't
reverse debug stmts.
---
 gcc/tree-inline.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 4a07d88f10bc5..b188a21df0e07 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2868,7 +2868,7 @@ maybe_move_debug_stmts_to_successors (copy_body_data *id, basic_block new_bb)
  gimple_set_location (stmt, UNKNOWN_LOCATION);
}
  gsi_remove (, false);
- gsi_insert_before (, stmt, GSI_SAME_STMT);
+ gsi_insert_before (, stmt, GSI_NEW_STMT);
  continue;
}
 
@@ -2894,7 +2894,7 @@ maybe_move_debug_stmts_to_successors (copy_body_data *id, basic_block new_bb)
new_stmt = as_a  (gimple_copy (stmt));
  else
gcc_unreachable ();
- gsi_insert_before (, new_stmt, GSI_SAME_STMT);
+ gsi_insert_before (, new_stmt, GSI_NEW_STMT);
  id->debug_stmts.safe_push (new_stmt);
  gsi_prev ();
}


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


don't access cfun in dump_function_to_file

2021-07-28 Thread Alexandre Oliva


dump_function_to_file takes the function to dump as a parameter, and
parts of it use the local fun variable where cfun would be used
elsewhere.  Others use cfun, presumably in error.  Fixed to use fun
uniformly.  Added a few more tests for non-NULL fun before
dereferencing it.
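
The class of bug is easy to reproduce in miniature. The sketch below is a toy, not tree-cfg.c: toy_function, toy_cfun, and toy_dump_form are hypothetical names, with toy_cfun playing the role of a global "current function" that the dumper should not consult.

```c
#include <assert.h>
#include <stddef.h>

/* Toy function descriptor; the two flags stand in for "has a CFG" and
   "is in GIMPLE form".  */
struct toy_function { int has_cfg; int is_gimple; };

/* Global "current function".  The dumper below must never read it.  */
struct toy_function *toy_cfun;

enum toy_form { TOY_FORM_NONE, TOY_FORM_CFG, TOY_FORM_GIMPLE };

/* Decide how to dump FUN, looking only at the parameter and tolerating
   a null FUN in every branch.  */
static enum toy_form
toy_dump_form (const struct toy_function *fun)
{
  if (fun && fun->has_cfg)
    return TOY_FORM_CFG;
  else if (fun && fun->is_gimple)
    return TOY_FORM_GIMPLE;
  return TOY_FORM_NONE;
}

/* Usage: the answer tracks the argument even when toy_cfun points at a
   different function.  */
void
toy_dump_demo (void)
{
  struct toy_function with_cfg = { 1, 1 };
  struct toy_function gimple_only = { 0, 1 };
  toy_cfun = &with_cfg;
  assert (toy_dump_form (&gimple_only) == TOY_FORM_GIMPLE);
  assert (toy_dump_form (NULL) == TOY_FORM_NONE);
}
```

Had toy_dump_form tested toy_cfun->has_cfg instead of fun->has_cfg, the first assertion in the demo would report the CFG form for a function that has no CFG.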

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* tree-cfg.c (dump_function_to_file): Use fun, not cfun.
---
 gcc/tree-cfg.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 30b1b56293e3b..38269a27b7978 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -8074,9 +8074,9 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
   : (fun->curr_properties & PROP_cfg) ? "cfg"
   : "");
 
-  if (cfun->cfg)
+  if (fun && fun->cfg)
{
- basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+ basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (fun);
  if (bb->count.initialized_p ())
fprintf (file, ",%s(%" PRIu64 ")",
 profile_quality_as_string (bb->count.quality ()),
@@ -8162,8 +8162,8 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
 
   tree name;
 
-  if (gimple_in_ssa_p (cfun))
-   FOR_EACH_SSA_NAME (ix, name, cfun)
+  if (gimple_in_ssa_p (fun))
+   FOR_EACH_SSA_NAME (ix, name, fun)
  {
if (!SSA_NAME_VAR (name)
/* SSA name with decls without a name still get
@@ -8199,7 +8199,7 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
 
   fprintf (file, "}\n");
 }
-  else if (fun->curr_properties & PROP_gimple_any)
+  else if (fun && (fun->curr_properties & PROP_gimple_any))
 {
   /* The function is now in GIMPLE form but the CFG has not been
 built yet.  Emit the single sequence of GIMPLE statements



Re: [PATCH] incorrect arguments designated in -Wnonnull for arrays

2021-07-28 Thread Uecker, Martin
On Tuesday, 27.07.2021 at 10:55 -0600, Martin Sebor wrote:
> On 7/26/21 12:22 PM, Jeff Law via Gcc-patches wrote:
> > 
> > On 7/25/2021 10:23 AM, Uecker, Martin wrote:
> > > Two arguments are switched for -Wnonnull when
> > > warning about array parameters with bounds > 0
> > > and which are NULL.
> > > 
> > > This patch corrects the mistake.
> > > 
> > > Martin
> > > 
> > > 
> > > 2021-07-25  Martin Uecker  
> > > 
> > > gcc/
> > >   * calls.c (maybe_warn_rdwr_sizes): Correct argument
> > >   numbers in warning that were switched.
> > > 
> > > gcc/testsuite/
> > >   * gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.
> > I'll defer to Martin Sebor on this.
> > 
> > Martin S., can you cover the review of this patch from Martin U?
> 
> The patch is correct.  Thanks for the fix!  It would ideally go
> into GCC 11 as well.

Committed.

Should I also push it to origin/releases/gcc-11?


Martin



Re: [PATCH] IBM Z: Enable LSan and TSan

2021-07-28 Thread Andreas Krebbel via Gcc-patches
On 7/27/21 10:04 PM, Ilya Leoshkevich via Gcc-patches wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> libsanitizer/ChangeLog:
> 
>   * configure.tgt (s390*-*-linux*): Enable LSan and TSan for
>   s390x.

Ok. Thanks!

Andreas


[PATCH] Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-28 Thread Andreas Krebbel via Gcc-patches
There are also memory operands passed for in0 and in1.

Ok for mainline?

gcc/ChangeLog:

* target.def: Describe in0 and in1 as being either register or
memory operands.
* doc/tm.texi: Regenerate.
---
 gcc/doc/tm.texi | 7 ---
 gcc/target.def  | 7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c8f4abe3e41..31f188daf00 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6124,9 +6124,10 @@ This hook is used to test whether the target can permute up to two
 vectors of mode @var{mode} using the permutation vector @code{sel}, and
 also to emit such a permutation.  In the former case @var{in0}, @var{in1}
 and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are
-the source vectors and @var{out} is the destination vector; all three are
-registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if
-@var{sel} describes a permutation on one vector instead of two.
+the source vectors and @var{out} is the destination vector.  The destination
+vector is a register of mode @var{mode} while the source vectors can be either
+register or memory operands of mode @var{mode}.  @var{in1} is the same as
+@var{in0} if @var{sel} describes a permutation on one vector instead of two.
 
 Return true if the operation is possible, emitting instructions for it
 if rtxes are provided.
diff --git a/gcc/target.def b/gcc/target.def
index 2e40448e6c5..b368d81be63 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1860,9 +1860,10 @@ DEFHOOK
 vectors of mode @var{mode} using the permutation vector @code{sel}, and\n\
 also to emit such a permutation.  In the former case @var{in0}, @var{in1}\n\
 and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are\n\
-the source vectors and @var{out} is the destination vector; all three are\n\
-registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
-@var{sel} describes a permutation on one vector instead of two.\n\
+the source vectors and @var{out} is the destination vector.  The destination\n\
+vector is a register of mode @var{mode} while the source vectors can be either\n\
+register or memory operands of mode @var{mode}.  @var{in1} is the same as\n\
+@var{in0} if @var{sel} describes a permutation on one vector instead of two.\n\
 \n\
 Return true if the operation is possible, emitting instructions for it\n\
 if rtxes are provided.\n\
-- 
2.31.1