date:20210209

This patch adds some XFAILs for PR98979 until the patch to fix them has
been approved. See:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564711.html

gcc/testsuite/
PR fortran/98979
* gfortran.dg/goacc/array-with-dt-2.f90: Add expected errors.
* gfortran.dg/goacc/derived-chartypes-1.f90: Skip ICEing test.
* gfortran.dg/goacc/derived-chartypes-2.f90: Likewise.

libgomp/
PR fortran/98979
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: Add expected
errors.
---
 gcc/testsuite/gfortran.dg/goacc/array-with-dt-2.f90  | 5 +++--
 gcc/testsuite/gfortran.dg/goacc/derived-chartypes-1.f90  | 3 +++
 gcc/testsuite/gfortran.dg/goacc/derived-chartypes-2.f90  | 3 +++
 libgomp/testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90 | 5 +++--
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/goacc/array-with-dt-2.f90 
b/gcc/testsuite/gfortran.dg/goacc/array-with-dt-2.f90
index 807580d75a9..e4a6f319772 100644
--- a/gcc/testsuite/gfortran.dg/goacc/array-with-dt-2.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/array-with-dt-2.f90
@@ -4,7 +4,8 @@ end type t
 
 type(t), allocatable :: b(:)
 
-!$acc update host(b(::2))
-!$acc update host(b(1)%A(::3,::4))
+! TODO: Remove expected errors when this is supported.
+!$acc update host(b(::2))  ! { dg-error "Stride should not be specified for 
array section in MAP clause" }
+!$acc update host(b(1)%A(::3,::4))  ! { dg-error "Stride should not be 
specified for array section in MAP clause" }
 end
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-1.f90 
b/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-1.f90
index e4d360e1262..f7aafbfc036 100644
--- a/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-1.f90
@@ -1,3 +1,6 @@
+! This currently ICEs. Avoid that.
+! { dg-skip-if "PR98979" { *-*-* } }
+
 type :: type1
   character(len=35) :: a
 end type type1
diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-2.f90 
b/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-2.f90
index cca6443e7fc..e22fc679df2 100644
--- a/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-2.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/derived-chartypes-2.f90
@@ -1,3 +1,6 @@
+! This currently ICEs. Avoid that.
+! { dg-skip-if "PR98979" { *-*-* } }
+
 type :: type1
   character(len=35,kind=4) :: a
 end type type1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90 
b/libgomp/testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90
index f04d76d583a..61250708197 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90
@@ -24,8 +24,9 @@ end do
 
 b(1)%A(:,:) = 5
 
-!$acc update device(b(::2))
-!$acc update device(b(1)%A(::3,::4))
+! TODO: Remove expected errors once this is supported.
+!$acc update device(b(::2))  ! { dg-error "Stride should not be specified for 
array section in MAP clause" }
+!$acc update device(b(1)%A(::3,::4))  ! { dg-error "Stride should not be 
specified for array section in MAP clause" }
 
 do i=1,20
   !$acc exit data copyout(b(i)%A)
-- 
2.29.2

[PATCH 1/2] Revert "openacc: Allow strided arrays in update directives"

This patch reverts the non-testsuite parts of commit
9a4d32f85ccebc0ee4b24e6d9d7a4f11c04d7146 which cause ICEs without the
yet-to-be-approved patch here:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564711.html

gcc/fortran/

PR fortran/98979
* openmp.c (resolve_omp_clauses): Omit OpenACC update in
contiguity check and stride-specified error.
---
 gcc/fortran/openmp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 797f6c86b62..aab17f0589f 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5192,8 +5192,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
   array isn't contiguous.  An expression such as
   arr(-n:n,-n:n) could be contiguous even if it looks
   like it may not be.  */
-   if (code->op != EXEC_OACC_UPDATE
-   && list != OMP_LIST_CACHE
+   if (list != OMP_LIST_CACHE
&& list != OMP_LIST_DEPEND
&& !gfc_is_simply_contiguous (n->expr, false, true)
&& gfc_is_not_contiguous (n->expr))
@@ -5231,7 +5230,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
int i;
gfc_array_ref *ar = _ref->u.ar;
for (i = 0; i < ar->dimen; i++)
- if (ar->stride[i] && code->op != EXEC_OACC_UPDATE)
+ if (ar->stride[i])
{
  gfc_error ("Stride should not be specified for "
 "array section in %s clause at %L",
-- 
2.29.2

[PATCH 0/2] openacc: Fix/temp workaround for PR98979

Hi,

Until a patch to fix derived type mappings with array elements, etc. is
approved, this small patch series (a) reverts the strided arrays in
update directives patch, and (b) adds some XFAILS/skips for the failing
tests flagged in PR98979.

I will apply shortly as obvious.

Thanks,

Julian

Julian Brown (2):
  Revert "openacc: Allow strided arrays in update directives"
  openacc: Add XFAILs [PR98979]

 gcc/fortran/openmp.c | 5 ++---
 gcc/testsuite/gfortran.dg/goacc/array-with-dt-2.f90  | 5 +++--
 gcc/testsuite/gfortran.dg/goacc/derived-chartypes-1.f90  | 3 +++
 gcc/testsuite/gfortran.dg/goacc/derived-chartypes-2.f90  | 3 +++
 libgomp/testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90 | 5 +++--
 5 files changed, 14 insertions(+), 7 deletions(-)

-- 
2.29.2

Patch for PR analyzer/98797

2021-02-09 Thread brian.sobulefsky via Gcc-patches

The attached patch has been bootstrapped and regression tested. It addresses PR 
analyzer/98797, which is built off the expected failures in 
gcc/testsuite/gcc.dg/analyzer/casts-1.c It fixes those failures and additional 
manifestations. That testsuite file has been updated to no longer expect 
failures and to include additional tests addressing the other manifestations I 
found.

Some additional work can probably be done to handle further cases, but this is 
now substantially more robust than the current status.

Thanks.commit f3654c5a322f241c8cc551f88257a495689ed9d9
Author: Brian Sobulefsky 
Date:   Tue Feb 9 16:25:43 2021 -0800

Address the bug found in test file gcc/testsuite/gcc.dg/analyzer/casts-1.c, 
as
recorded by the XFAIL, and some simpler and more complex versions found 
during
the investigation of it. PR analyzer/98797 was opened for these bugs.

The simplest manifestation of this bug can be replicated with:

char arr[] = {'A', 'B', 'C', 'D'};
char *pt = ar;
__analyzer_eval(*(pt + 0) == 'A');
__analyzer_eval(*(pt + 1) == 'B');
//etc

The result of the eval call is "UNKNOWN" because the analyzer is unable to
determine the value at the pointer. Note that array access (such as arr[0]) 
is
correctly handled. The code responsible for this is in file
region-model-manager.cc, function 
region_model_manager::maybe_fold_sub_svalue.
The relevant section is commented /* Handle getting individual chars from
STRING_CST */. This section only had a case for an element_region. A case
needed to be added for an offset_region.

Additionally, when the offset was 0, such as in *pt or *(pt + 0), the 
function
region_model_manager::get_offset_region was failing to make the needed 
offset
region at all. This was due to the test for a constant 0 pointer that was 
then
returning get_cast_region. A special case is needed for when PARENT is of 
type
array_type whose type matches TYPE. In this case, get_offset_region is 
allowed
to continue to normal conclusion.

The original bug noted in gcc/testsuite/gcc.dg/analyzer/casts-1.c was for 
the
case:

struct s1
{
  char a;
  char b;
  char c;
  char d;
};

struct s2
{
  char arr[4];
};

struct s2 x = {{'A', 'B', 'C', 'D'}};
struct s1 *p = (struct s1 *)
__analyzer_eval (p->a == 'A');
//etc

This requires a case added to region_model_manager::maybe_fold_sub_svalue in
the individual characters from string constant section for a field region.

Finally, the prior only works for the case where struct s2 was a single 
field
struct. The more general case is:

struct s2
{
  char arr1[2];
  char arr2[2];
};

struct s2 x = {{'A', 'B'}, {'C', 'D'}};

Here the array will never be found in the get_any_binding routines. A new
routine is added to class binding_cluster that checks if the region being
searched is a field_region of a cast_region, and if so, tries to find a 
field
of the original region that contains the region under examination. This new
function is named binding_cluster::get_parent_cast_binding. It is called 
from
get_binding_recursive.

gcc/analyzer/ChangeLog:
PR analyzer/98797
* region-model-manager.cc region_model_manager::get_offset_region: 
Add
check for a PARENT array whose type matches TYPE, and have this skip
returning get_cast_region and rather conclude the function normally.
* region-model-manager.cc 
region_model_manager::maybe_fold_sub_svalue
Update the get character from string_cst section to include cases 
for
an offset_region and a field_region.
* store.cc binding_cluster::get_binding_recursive: Add case for call
to new function get_parent_cast_binding.
* store.cc binding_cluster::get_parent_cast_binding: New function.
* store.h class binding_cluster: Add declaration for new member
function get_parent_class_binding and macros for bit to byte
conversion.

gcc/testsuite/ChangeLog:
PR analyzer/98797
* gcc.dg/analyzer/casts-1.c: Update file to no longer expect 
failures
and add test cases for additional bugs solved.

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index dfd2413e914..9aee4c498c6 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -602,16 +602,46 @@ region_model_manager::maybe_fold_sub_svalue (tree type,
   /* Handle getting individual chars from a STRING_CST.  */
   if (tree cst = parent_svalue->maybe_get_constant ())
 if (TREE_CODE (cst) == STRING_CST)
-  if (const element_region *element_reg
+  {
+   if (const element_region *element_reg
=

Re: [PATCH 3/4] c++: Delay normalizing nested requirements until satisfaction


On 2/9/21 6:04 PM, Jason Merrill wrote:

On 2/8/21 2:03 PM, Patrick Palka wrote:

This sets up the functionality for controlling the initial set of
template parameters to pass to normalization when dealing with a
constraint-expression that is not associated with some constrained
declaration, for instance when normalizing a nested requirement of a
requires expression, or the constraints on a placeholder type.

The main new ingredient here is the data member norm_info::initial_parms
which can be set by callers of the normalization routines to communicate
the in-scope template parameters for the supplied constraint-expression,
rather than always falling back to using current_template_parms.

This patch then uses this functionality in our handling of nested
requirements so that we can delay normalizing them until needed for
satisfaction.  We currently immediately normalize nested requirements at
parse time, where we have the necessary template context, and cache the
normal form in their TREE_TYPE node.  With this patch, we now delay
normalization until needed (as with other constraint expressions), and
instead store the current value of current_template_parms in their
TREE_TYPE node (which we use to restore the template context at
normalization time).

In the subsequent patch, this functionality will also be used to
normalize placeholder type constraints during auto deduction.

gcc/cp/ChangeLog:

* constraint.cc (build_parameter_mapping): Rely on the caller to
determine the in-scope template parameters.
(norm_info::norm_info): Delegate the one-parameter constructor
to the two-parameter constructor.  In the two-parameter
constructor, fold in the definition of make_context, set
initial_parms appropriately, and don't set the now-removed
orig_decl member.
(norm_info::make_context): Remove, now that its only use is
inlined into the caller.
(norm_info::update_context): Adjust call to
build_parameter_mapping to pass in the relevant set of in-scope
template parameters.
(norm_info::ctx_parms): Define this member function.
(norm_info::context): Initialize to NULL_TREE.
(norm_info::orig_decl): Remove this data member.
(norm_info::initial_parms): Define this data member.
(normalize_atom): Adjust call to build_parameter_mapping to pass
in the relevant set of in-scope template parameters.  Use
info.initial_parms instead of info.orig_decl.
(normalize_constraint_expression): Define an overload that takes
a norm_info object.  Cache the result of normalization.  Define
the other overload in terms of this one, and handle a NESTED_REQ
argument by setting info.initial_parms appropriately.
(tsubst_nested_requirement): Go through
satisfy_constraint_expression so that we normalize on demand.
(finish_nested_requirement): Set the TREE_TYPE of the NESTED_REQ
to current_template_parms.
(diagnose_nested_requirements): Go through
satisfy_constraint_expression, as with tsubst_nested_requirement.
---
  gcc/cp/constraint.cc | 140 +++
  gcc/cp/cp-tree.h |   4 +-
  2 files changed, 78 insertions(+), 66 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 39c97986082..56134f8b2bf 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -133,7 +133,7 @@ struct sat_info : subst_info
    bool diagnose_unsatisfaction;
  };
-static tree satisfy_constraint (tree, tree, sat_info);
+static tree satisfy_constraint_expression (tree, tree, sat_info);
  /* True if T is known to be some type other than bool. Note that this
 is false for dependent types and errors.  */
@@ -594,26 +594,12 @@ map_arguments (tree parms, tree args)
    return parms;
  }
-/* Build the parameter mapping for EXPR using ARGS.  */
+/* Build the parameter mapping for EXPR using ARGS, where CTX_PARMS
+   are the template parameters in scope for EXPR.  */
  static tree
-build_parameter_mapping (tree expr, tree args, tree decl)
+build_parameter_mapping (tree expr, tree args, tree ctx_parms)
  {
-  tree ctx_parms = NULL_TREE;
-  if (decl)
-    {
-  gcc_assert (TREE_CODE (decl) == TEMPLATE_DECL);
-  ctx_parms = DECL_TEMPLATE_PARMS (decl);
-    }
-  else if (current_template_parms)
-    {
-  /* TODO: This should probably be the only case, but because the
- point of declaration of concepts is currently set after the
- initializer, the template parameter lists are not available
- when normalizing concept definitions, hence the case above.  */
-  ctx_parms = current_template_parms;
-    }
-
    tree parms = find_template_parameters (expr, ctx_parms);
    tree map = map_arguments (parms, args);
    return map;
@@ -645,53 +631,63 @@ parameter_mapping_equivalent_p (tree t1, tree t2)
  struct norm_info : subst_info
  {
-  explicit norm_info (tsubst_flags_t complain)
-    : subst_info (tf_warning_or_error | complain, NULL_TREE),
-  context()
+  explicit norm_info

[committed] add links to manual


While editing changes.html I noticed a few missing links to the user
manual.  I've pushed the attached change to add them.

Martin
commit 5a57e261bcfbb7691901bdf219ba114b449b690e
Author: Martin Sebor 
Date:   Tue Feb 9 17:40:01 2021 -0700

Move -Wtsan under New Warnings.  Add more links to the user manual.

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 80d39b73..de75b8d6 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -174,7 +174,8 @@ a work-in-progress.
 
   New attributes:
 
-The no_stack_protector attribute has been added to mark functions which should not be instrumented
+  The https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fstack_005fprotector-function-attribute;>no_stack_protector
+	attribute has been added to mark functions which should not be instrumented
 with stack protection (-fstack-protector).
 	The existing
 	  https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute;>malloc
@@ -205,6 +206,10 @@ a work-in-progress.
 	GCC releases most instances of his warning are diagnosed by
 	-Wstringop-overflow.
   
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wtsan;>-Wtsan,
+	enabled by default, warns about unsupported features in ThreadSanitizer
+	(currently std::atomic_thread_fence).
+  
 
   
   Enhancements to existing warnings:
@@ -238,8 +243,6 @@ a work-in-progress.
 This functionality requires Binutils version 2.36 or later.
 
   
-  A new warning https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wtsan;>-Wtsan, enabled by default,
-  warns about unsupported features in ThreadSanitizer (currently std::atomic_thread_fence).
   A series of conditional expressions that compare the same variable can be transformed into a switch statement
   if each of them contains a comparison expression.  Example:
   
@@ -376,19 +379,23 @@ a work-in-progress.
   (https://gcc.gnu.org/PR97518;>PR97518).
   New warnings:
 
-  -Wctad-maybe-unsupported, disabled by default, warns
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wctad-maybe-unsupported;>-Wctad-maybe-unsupported,
+	disabled by default, warns
 	  about performing class template argument deduction on a type with no
 	  deduction guides.
   
-  -Wrange-loop-construct, enabled by -Wall,
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wrange-loop-construct;>-Wrange-loop-construct,
+	enabled by -Wall,
 	  warns when a range-based for-loop is creating unnecessary and
 	  expensive copies.
   
-  -Wdeprecated-enum-enum-conversion, enabled by default in
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wdeprecated-enum-enum-conversion;>-Wdeprecated-enum-enum-conversion,
+	enabled by default in
 	  C++20, warns about deprecated arithmetic conversions on operands of
 	  enumeration types, as outlined in [depr.arith.conv.enum].
   
-  -Wdeprecated-enum-float-conversion, enabled by default in
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wdeprecated-enum-float-conversion;>-Wdeprecated-enum-float-conversion,
+	enabled by default in
 	  C++20, warns about deprecated arithmetic conversions on operands where
 	  one is of enumeration type and the other is of a floating-point type,
 	  as outlined in [depr.arith.conv.enum].
@@ -399,7 +406,8 @@ a work-in-progress.
 	forms of operator new or from other mismatched allocation
 	functions.
   
-  -Wvexing-parse, enabled by default, warns about the most
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wvexing-parse;>-Wvexing-parse,
+	enabled by default, warns about the most
 	  vexing parse rule: the cases when a declaration looks like a variable
 	  definition, but the C++ language requires it to be interpreted as a
 	  function declaration.

[committed] GCC 11 changes


I pushed the attached change documenting my GCC 11 changes.

I validated the file on https://validator.w3.org.

Martin
commit cf0d4e41a94bae204a8c5d2490063d58cdb1d4e3
Author: Martin Sebor 
Date:   Tue Feb 9 17:12:16 2021 -0700

Update new attribute malloc and document new and enhanced warnings.

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 63efaf37..80d39b73 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -179,21 +179,47 @@ a work-in-progress.
 	The existing
 	  https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute;>malloc
 	  attribute has been extended so that it can be used to identify
-	  allocator/deallocator API pairs.  A new
-	  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#-Wmismatched-dealloc;>-Wmismatched-dealloc
-	  warning will complain about incorrect calls.  Additionally, the
-	  static analyzer will use these attributes when checking for leaks,
-	  double-frees, use-after-frees, and similar issues.
+	  allocator/deallocator API pairs.  A pair of new
+	  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmismatched-dealloc;>-Wmismatched-dealloc and https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wmismatched-new-delete;>-Wmismatched-new-delete warnings will complain
+	  about mismatched calls, and https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wfree-nonheap-object;>-Wfree-nonheap-object about deallocation calls with pointers not obtained from allocation
+	  functions.  Additionally, the static analyzer will use these
+	  attributes when checking for leaks, double-frees, use-after-frees,
+	  and similar issues.
 	
 
   New warnings:
 
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmismatched-dealloc;>-Wmismatched-dealloc,
+	enabled by default, warns about calls to deallocation functions
+	with pointers returned from mismatched allocation functions.
+  
   https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wsizeof-array-div;>-Wsizeof-array-div,
 	  enabled by -Wall, warns
 	  about divisions of two sizeof operators when the first one is applied
 	  to an array and the divisor does not equal the size of the array
 	  element.
   
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wstringop-overread;>-Wstringop-overread,
+	enabled by default, warns about calls to string functions reading
+	past the end of the arrays passed to them as arguments.  In prior
+	GCC releases most instances of his warning are diagnosed by
+	-Wstringop-overflow.
+  
+
+  
+  Enhancements to existing warnings:
+
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wfree-nonheap-object;>-Wfree-nonheap-object
+	detects many more instances of calls to deallocation functions with
+	pointers not returned from a dynamic memory allocation function.
+  
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uninitialized;>-Wmaybe-uninitialized
+	diagnoses passing pointers or references to uninitialized memory
+	to functions taking const-qualified arguments.
+  
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wuninitialized;>-Wuninitialized
+	detects reads from uninitialized dynamically allocated memory.
+  
 
   
   
@@ -275,6 +301,22 @@ a work-in-progress.
 Labels may appear before declarations and at the end of a
 compound statement.
   
+  New warnings:
+
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Warray-parameter;>-Warray-parameter,
+	enabled by -Wall, warns about redeclarations of functions
+	with ordinary array arguments declared using inconsistent forms.
+	The warning also enables the detection of the likely out of bounds
+	accesses in calls to such functions with smaller arrays.
+  
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wvla-parameter;>-Wvla-parameter,
+	enabled by -Wall, warns redeclarations of functions
+	with variable length array arguments declared using inconsistent
+	forms or with mismatched bounds.  The warning also enables
+	the detection of the likely out of bounds accesses in calls to
+	such functions with smaller arrays.
+  
+
 
 
 C++
@@ -351,12 +393,27 @@ a work-in-progress.
 	  one is of enumeration type and the other is of a floating-point type,
 	  as outlined in [depr.arith.conv.enum].
   
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wmismatched-new-delete;>-Wmismatched-new-delete,
+	enabled by -Wall, warns about calls to C++
+	operator delete with pointers returned from mismatched
+	forms of operator new or from other mismatched allocation
+	functions.
+  
   -Wvexing-parse, enabled by default, warns about the most
 	  vexing parse rule: the cases when a declaration looks like a variable
 	  definition, but the C++ language requires it to be interpreted

Re: [PATCH 2/5] Make sure that static data member constexpr isn't optimized away in test.

On Tue, Feb 09, 2021 at 08:55:26PM +0100, Jakub Jelinek wrote:
> On Tue, Feb 09, 2021 at 02:40:12PM -0500, Jason Merrill wrote:
> > For GCC 11, I think let's fix the regression with your (Jakub) earlier
> > patch, maybe only for DIEs with DW_AT_const_value.
> 
> Thanks.
> Following works too, so I'll test it tonight.

The patch fixed not just the expected
-FAIL: g++.dg/debug/dwarf2/constexpr-var-1.C   scan-assembler-times  
DW_AT_const_expr 2
but also
-FAIL: libstdc++-prettyprinters/80276.cc whatis p4
-FAIL: libstdc++-prettyprinters/80276.cc whatis p4
-FAIL: libstdc++-prettyprinters/libfundts.cc print as
-FAIL: libstdc++-prettyprinters/libfundts.cc print as
-FAIL: libstdc++-prettyprinters/libfundts.cc print os
-FAIL: libstdc++-prettyprinters/libfundts.cc print os

Jakub

Re: [PATCH 3/4] c++: Delay normalizing nested requirements until satisfaction


On 2/8/21 2:03 PM, Patrick Palka wrote:

This sets up the functionality for controlling the initial set of
template parameters to pass to normalization when dealing with a
constraint-expression that is not associated with some constrained
declaration, for instance when normalizing a nested requirement of a
requires expression, or the constraints on a placeholder type.

The main new ingredient here is the data member norm_info::initial_parms
which can be set by callers of the normalization routines to communicate
the in-scope template parameters for the supplied constraint-expression,
rather than always falling back to using current_template_parms.

This patch then uses this functionality in our handling of nested
requirements so that we can delay normalizing them until needed for
satisfaction.  We currently immediately normalize nested requirements at
parse time, where we have the necessary template context, and cache the
normal form in their TREE_TYPE node.  With this patch, we now delay
normalization until needed (as with other constraint expressions), and
instead store the current value of current_template_parms in their
TREE_TYPE node (which we use to restore the template context at
normalization time).

In the subsequent patch, this functionality will also be used to
normalize placeholder type constraints during auto deduction.

gcc/cp/ChangeLog:

* constraint.cc (build_parameter_mapping): Rely on the caller to
determine the in-scope template parameters.
(norm_info::norm_info): Delegate the one-parameter constructor
to the two-parameter constructor.  In the two-parameter
constructor, fold in the definition of make_context, set
initial_parms appropriately, and don't set the now-removed
orig_decl member.
(norm_info::make_context): Remove, now that its only use is
inlined into the caller.
(norm_info::update_context): Adjust call to
build_parameter_mapping to pass in the relevant set of in-scope
template parameters.
(norm_info::ctx_parms): Define this member function.
(norm_info::context): Initialize to NULL_TREE.
(norm_info::orig_decl): Remove this data member.
(norm_info::initial_parms): Define this data member.
(normalize_atom): Adjust call to build_parameter_mapping to pass
in the relevant set of in-scope template parameters.  Use
info.initial_parms instead of info.orig_decl.
(normalize_constraint_expression): Define an overload that takes
a norm_info object.  Cache the result of normalization.  Define
the other overload in terms of this one, and handle a NESTED_REQ
argument by setting info.initial_parms appropriately.
(tsubst_nested_requirement): Go through
satisfy_constraint_expression so that we normalize on demand.
(finish_nested_requirement): Set the TREE_TYPE of the NESTED_REQ
to current_template_parms.
(diagnose_nested_requirements): Go through
satisfy_constraint_expression, as with tsubst_nested_requirement.
---
  gcc/cp/constraint.cc | 140 +++
  gcc/cp/cp-tree.h |   4 +-
  2 files changed, 78 insertions(+), 66 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 39c97986082..56134f8b2bf 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -133,7 +133,7 @@ struct sat_info : subst_info
bool diagnose_unsatisfaction;
  };
  
-static tree satisfy_constraint (tree, tree, sat_info);

+static tree satisfy_constraint_expression (tree, tree, sat_info);
  
  /* True if T is known to be some type other than bool. Note that this

 is false for dependent types and errors.  */
@@ -594,26 +594,12 @@ map_arguments (tree parms, tree args)
return parms;
  }
  
-/* Build the parameter mapping for EXPR using ARGS.  */

+/* Build the parameter mapping for EXPR using ARGS, where CTX_PARMS
+   are the template parameters in scope for EXPR.  */
  
  static tree

-build_parameter_mapping (tree expr, tree args, tree decl)
+build_parameter_mapping (tree expr, tree args, tree ctx_parms)
  {
-  tree ctx_parms = NULL_TREE;
-  if (decl)
-{
-  gcc_assert (TREE_CODE (decl) == TEMPLATE_DECL);
-  ctx_parms = DECL_TEMPLATE_PARMS (decl);
-}
-  else if (current_template_parms)
-{
-  /* TODO: This should probably be the only case, but because the
-point of declaration of concepts is currently set after the
-initializer, the template parameter lists are not available
-when normalizing concept definitions, hence the case above.  */
-  ctx_parms = current_template_parms;
-}
-
tree parms = find_template_parameters (expr, ctx_parms);
tree map = map_arguments (parms, args);
return map;
@@ -645,53 +631,63 @@ parameter_mapping_equivalent_p (tree t1, tree t2)
  
  struct norm_info : subst_info

  {
-  explicit norm_info (tsubst_flags_t complain)
-:

[PATCH] c++: Endless loop with targ deduction in member tmpl [PR95888]

2021-02-09 Thread Marek Polacek via Gcc-patches

My r10-7007 patch tweaked tsubst not to reduce the template level of
template parameters when tf_partial.  That caused infinite looping in
is_specialization_of: we ended up with a class template specialization
whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
loop in is_specialization_of never finished.

There's a lot going on in this test, but essentially: the template fn
here has two template parameters, we call it with one explicitly
provided, the other one has to be deduced.  So we'll find ourselves
in fn_type_unification which uses tf_partial when tsubsting the
*explicit* template arguments into the function type.  That leads to
tsubstituting the return type, C.  C is a member template; its
most general template is

  template template struct B::C

we figure out (tsubst_template_args) that the template argument list
is .  They come from different levels, one comes from B,
the other one from fn.

So now we lookup_template_class to see if we have C.  We
do the
  /* This is a full instantiation of a member template.  Find
 the partial instantiation of which this is an instance.  */
  TREE_VEC_LENGTH (arglist)--;
  // arglist is now , not 
  found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
  TREE_VEC_LENGTH (arglist)++;

magic which is looking for the partial instantiation, in this case,
that would be template struct B::C.  Note we're still
in a tf_partial context!  So the tsubst_template_args in the tsubst
(which tries to substitute  into ) returns , but
V's template level hasn't been reduced!  After tsubst_template_args,
tsubst_template_decl looks to see if we already have this specialization:

  // t = template_decl C
  // full_args = 
  spec = retrieve_specialization (t, full_args, hash);

but doesn't find the one we created a while ago, when processing
B b; in the test, because V's levels don't match.  Whereupon
tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to
the infinite looping problem.

I think let's clear tf_partial when looking for an existing partial
instantiation.

It also occurred to me that I should be able to trigger a similar
problem with 'auto', since r10-7007 removed an is_auto check.  And lo,
I constructed deduce10.C which exhibits the same issue with pre-r10-7007
compilers.  This patch fixes that problem as well.  I'm ecstatic.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?  Also
built cmcstl2.

gcc/cp/ChangeLog:

PR c++/95888
* pt.c (lookup_template_class_1): Clear tf_partial when looking for
the partial instantiation.

gcc/testsuite/ChangeLog:

PR c++/95888
* g++.dg/template/deduce10.C: New test.
* g++.dg/template/deduce9.C: New test.
---
 gcc/cp/pt.c  | 10 +-
 gcc/testsuite/g++.dg/template/deduce10.C | 23 +++
 gcc/testsuite/g++.dg/template/deduce9.C  | 23 +++
 3 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/template/deduce10.C
 create mode 100644 gcc/testsuite/g++.dg/template/deduce9.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f73deb3aee3..5b8a08a30e0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10137,7 +10137,15 @@ lookup_template_class_1 (tree d1, tree arglist, tree 
in_decl, tree context,
  /* Temporarily reduce by one the number of levels in the ARGLIST
 so as to avoid comparing the last set of arguments.  */
  TREE_VEC_LENGTH (arglist)--;
- found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
+ /* Clear tf_partial, which can be set because we might be at the
+beginning of template argument deduction when any explicitly
+specified template arguments are substituted into the function
+type.  If we don't clear tf_partial, we won't find the partial
+instantiation that might have been created outside tf_partial
+context, because the levels of template parameters wouldn't
+match, because in a tf_partial context, tsubst doesn't reduce
+TEMPLATE_PARM_LEVEL.  */
+ found = tsubst (gen_tmpl, arglist, complain & ~tf_partial, NULL_TREE);
  TREE_VEC_LENGTH (arglist)++;
  /* FOUND is either a proper class type, or an alias
 template specialization.  In the later case, it's a
diff --git a/gcc/testsuite/g++.dg/template/deduce10.C 
b/gcc/testsuite/g++.dg/template/deduce10.C
new file mode 100644
index 000..165ff195728
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/deduce10.C
@@ -0,0 +1,23 @@
+// PR c++/95888
+// { dg-do compile { target c++17 } }
+
+template  class A {
+  A(int, int);
+  template  friend class A;
+  friend T;
+};
+
+template struct B {
+  template struct C {
+A begin() { return {1, 0}; }
+  };
+  template
+  C fn();
+};
+
+int
+main ()
+{
+  B b;
+  b.fn<1>().begin();
+}
diff --git a/gcc/testsuite/g++.dg/template/deduce9.C

Re: [Patch] Fortran: intrinsic.texi add missing arg to FINDLOC (was: FINDLOC documentation at https://gcc.gnu.org/onlinedocs/gfortran/FINDLOC.html)


Hi Thomas,

On 09.02.21 22:54, Thomas Koenig via Fortran wrote:

Just one nit: MASK does not have to be an array, it can
also be a scalar. It just has to be conformable.


Well spotted. However, that also applies to {MAX,MIN}{LOC,VAL}. Hence, I 
updated those as well.


I will commit the patch tomorrow.

Thanks,

Tobias

Fortran: intrinsic.texi add missing arg to FINDLOC

gcc/fortran/ChangeLog:

	* intrinsic.texi (FINDLOC): Add 'MASK' to argument table.
	(MAXLOC, MAXVAL, MINLOC, MINVAL): For 'MASK', remove 'an
	array' as scalars are also permitted.

diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index 63416bce7fd..ea78f7d343a 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi
@@ -6188,6 +6188,8 @@ conformance with @var{ARRAY}.
 @item @var{DIM} @tab (Optional) Shall be a scalar of type
 @code{INTEGER}, with a value between one and the rank of @var{ARRAY},
 inclusive.  It may not be an optional dummy argument.
+@item @var{MASK} @tab (Optional) Shall be of type @code{LOGICAL},
+and conformable with @var{ARRAY}.
 @item @var{KIND} @tab (Optional) An @code{INTEGER} initialization
 expression indicating the kind parameter of the result.
 @item @var{BACK} @tab (Optional) A scalar of type @code{LOGICAL}.
@@ -10356,7 +10358,7 @@ Transformational function
 @item @var{DIM}   @tab (Optional) Shall be a scalar of type
 @code{INTEGER}, with a value between one and the rank of @var{ARRAY},
 inclusive.  It may not be an optional dummy argument.
-@item @var{MASK}  @tab Shall be an array of type @code{LOGICAL},
+@item @var{MASK}  @tab Shall be of type @code{LOGICAL},
 and conformable with @var{ARRAY}.
 @item @var{KIND} @tab (Optional) An @code{INTEGER} initialization
 expression indicating the kind parameter of the result.
@@ -10417,7 +10419,7 @@ Transformational function
 @item @var{DIM}   @tab (Optional) Shall be a scalar of type
 @code{INTEGER}, with a value between one and the rank of @var{ARRAY},
 inclusive.  It may not be an optional dummy argument.
-@item @var{MASK}  @tab (Optional) Shall be an array of type @code{LOGICAL},
+@item @var{MASK}  @tab (Optional) Shall be of type @code{LOGICAL},
 and conformable with @var{ARRAY}.
 @end multitable
 
@@ -10723,7 +10725,7 @@ Transformational function
 @item @var{DIM}   @tab (Optional) Shall be a scalar of type
 @code{INTEGER}, with a value between one and the rank of @var{ARRAY},
 inclusive.  It may not be an optional dummy argument.
-@item @var{MASK}  @tab Shall be an array of type @code{LOGICAL},
+@item @var{MASK}  @tab Shall be of type @code{LOGICAL},
 and conformable with @var{ARRAY}.
 @item @var{KIND} @tab (Optional) An @code{INTEGER} initialization
 expression indicating the kind parameter of the result.
@@ -10784,7 +10786,7 @@ Transformational function
 @item @var{DIM}   @tab (Optional) Shall be a scalar of type
 @code{INTEGER}, with a value between one and the rank of @var{ARRAY},
 inclusive.  It may not be an optional dummy argument.
-@item @var{MASK}  @tab Shall be an array of type @code{LOGICAL},
+@item @var{MASK}  @tab Shall be of type @code{LOGICAL},
 and conformable with @var{ARRAY}.
 @end multitable

Re: [PATCH] c++: Fix ICE from op_unqualified_lookup [PR97582]

2021-02-09 Thread Patrick Palka via Gcc-patches

On Tue, 2 Feb 2021, Jason Merrill wrote:

> On 2/2/21 12:19 AM, Patrick Palka wrote:
> > In this testcase, we're crashing because the lookup of operator+ from
> > within the generic lambda via lookup_name finds multiple bindings
> > (namely C1::operator+ and C2::operator+) and returns a TREE_LIST
> > thereof, something which maybe_save_operator_binding isn't prepared to
> > handle.
> > 
> > Since we already discard the result of lookup_name when it returns a
> > class-scope binding here, it seems cleaner (and equivalent) to instead
> > communicate to lookup_name that we don't want such bindings in the first
> > place.  While this change seems like an improvement on its own, it also
> > fixes the mentioned PR, because the call to lookup_name now returns
> > NULL_TREE rather than a TREE_LIST of (unwanted) class-scope bindings.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk/9/10?
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/97582
> > * name-lookup.c (op_unqualified_lookup): Pass BLOCK_NAMESPACE to
> > lookup_name in order to ignore class-scope bindings, rather
> > than discarding them after the fact.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/97582
> > * g++.dg/cpp0x/lambda/lambda-template17.C: New test.
> > ---
> >   gcc/cp/name-lookup.c  | 11 +++
> >   gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C |  8 
> >   2 files changed, 11 insertions(+), 8 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
> > 
> > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> > index 52e4a630e25..46d6cc0dfa4 100644
> > --- a/gcc/cp/name-lookup.c
> > +++ b/gcc/cp/name-lookup.c
> > @@ -9213,17 +9213,12 @@ op_unqualified_lookup (tree fnname)
> > return NULL_TREE;
> >   }
> >   -  tree fns = lookup_name (fnname);
> > +  /* We don't need to remember class-scope functions or declarations,
> > + normal unqualified lookup will find them again.  */
> > +  tree fns = lookup_name (fnname, LOOK_where::BLOCK_NAMESPACE);
> 
> Hmm, I'd expect this to look past class-scope declarations to find
> namespace-scope declarations, but we want class decls to hide decls in an
> outer scope.

D'oh, good point.  But IIUC, even if we did return (and later inject at
instantiation time) namespace-scope declarations that were hidden by
class-scope declarations, wouldn't the lookup at instantiation time
still find and prefer the class-scope bindings (as desired)?  It seems
to me that the end result might be the same, but I'm not sure.

Alternatively, would it be safe to assume that if lookup_name returns an
ambiguous result, then the result must consist of class-scope
declarations and so we can discard it?

> 
> > if (!fns)
> >   /* Remember we found nothing!  */
> >   return error_mark_node;
> > -
> > -  tree d = is_overloaded_fn (fns) ? get_first_fn (fns) : fns;
> > -  if (DECL_CLASS_SCOPE_P (d))
> > -/* We don't need to remember class-scope functions or declarations,
> > -   normal unqualified lookup will find them again.  */
> > -fns = NULL_TREE;
> > -
> > return fns;
> >   }
> >   diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
> > b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
> > new file mode 100644
> > index 000..6cafbab8cb0
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
> > @@ -0,0 +1,8 @@
> > +// PR c++/97582
> > +// { dg-do compile { target c++11 } }
> > +
> > +struct C1 { void operator+(); };
> > +struct C2 { void operator+(); };
> > +struct C3 : C1, C2 {
> > +  template  void get() { [] (T x) { +x; }; }
> > +};
> > 
> 
>

Re: [Patch] Fortran: intrinsic.texi add missing arg to FINDLOC (was: FINDLOC documentation at https://gcc.gnu.org/onlinedocs/gfortran/FINDLOC.html)

2021-02-09 Thread Thomas Koenig via Gcc-patches


Hi Tobias,


I intent commit it as obvious, unless there are comments.


Just one nit: MASK does not have to be an array, it can
also be a scalar. It just has to be conformable.

OK with that change.

Best regards

Thomas

Re: [PATCH] adjust "partly out of bounds" warning (PR 98503)


On 2/8/21 4:26 PM, Jeff Law wrote:



On 2/8/21 4:07 PM, Martin Sebor wrote:

On 2/8/21 12:09 PM, Jeff Law wrote:



On 2/3/21 3:45 PM, Martin Sebor wrote:

On 2/3/21 2:57 PM, Jeff Law wrote:



On 1/28/21 4:03 PM, Martin Sebor wrote:

The GCC 11 -Warray-bounds enhancement to diagnose accesses whose
leading offset is in bounds but whose trailing offset is not has
been causing some confusion.  When the warning is issued for
an access to an in-bounds member via a pointer to a struct that's
larger than the pointed-to object, phrasing this strictly invalid
access in terms of array subscripts can be misleading, especially
when the source code doesn't involve any arrays or indexing.

Since the problem boils down to an aliasing violation much more
so than an actual out-of-bounds access, the attached patch adjusts
the code to issue a -Wstrict-aliasing warning in these cases instead
of -Warray-bounds.  In addition, since the aliasing assumptions in
GCC can be disabled by -fno-strict-aliasing, the patch also makes
these instances of the warning conditional on -fstrict-aliasing
being in effect.

Martin

gcc-98503.diff

PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be
more appropriate

gcc/ChangeLog:

  PR middle-end/98503
  * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
  Issue -Wstrict-aliasing for a subset of violations.
  (array_bounds_checker::check_array_bounds):  Set new member.
  * gimple-array-bounds.h (array_bounds_checker::cref_of_mref):
New
  data member.

gcc/testsuite/ChangeLog:

  PR middle-end/98503
  * g++.dg/warn/Warray-bounds-10.C: Adjust text of expected
warnings.
  * g++.dg/warn/Warray-bounds-11.C: Same.
  * g++.dg/warn/Warray-bounds-12.C: Same.
  * g++.dg/warn/Warray-bounds-13.C: Same.
  * gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing.  Adjust
text
  of expected warnings.
  * gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings.
  * gcc.dg/Wstrict-aliasing-2.c: New test.
  * gcc.dg/Wstrict-aliasing-3.c: New test.

What I don't like here is the strict-aliasing warnings inside the file
and analysis phase for array bounds checking.

ISTM that catching this at cast time would be better.  So perhaps in
build_c_cast and the C++ equivalent?

It would mean we're warning at the cast site rather than the
dereference, but that may ultimately be better for the warning anyway.
I'm not sure.


I had actually experimented with a this (in the middle end, and only
for accesses) but even that caused so many warnings that I abandoned
it, at least for now.  -Warray-bounds has the benefit of flow analysis
(and dead code elimination).  In the front end it would have neither
and be both excessively noisy and ineffective at the same time.  For
example:

I think we agree that this really is an aliasing issue and I would go
further and claim that it has nothing to do with array bounds checking.
Therefore warning for it within gimple-array-bounds just seems wrong.

I realize that you like having DCE run and the ability to look at
offsets and such, but it really feels like the wrong place to do this.
Aliasing issues are also more of a front-end thing since the language
defines what is and what is not valid aliasing -- one might even argue
that any aliasing warning needs to be identified by the front-ends
exclusively (though possibly pruned away later).


The aliasing restrictions are on accesses, which are [defined in
C as] execution-time actions on objects.  Analyzing execution-time
actions unavoidably depends on flow analysis which the front ends
have only very limited support for (simple expressions only).
I gave one example showing how the current -Wstrict-aliasing in
the front end is ineffective against all but the most simplistic
bugs, but there are many more.  For instance:

I'm aware of all this.  Even so I think trying to deal with strict
aliasing in the middle-end is fundamentally wrong.  The middle end is
not C and it can not assume C semantics and putting the warning in the
middle of the array bounds pass is, IMHO, just wrong.


All the existing access warnings in the middle end are designed
around C/C++ (and other language) rules, just as much as all middle
end optimizations, including aliasing, rely on them.  I don't see
how this instance is any different.

-Warray-bounds is issued in the VRP pass (there is no dedicated
array bounds pass).  It just happens to live in a file recently
named gimple-array-bounds.cc.  The name was chosen only because
that was the only warning that happened to be implemented there
at the time.  I have no problem with keeping separate things
separate but I also see nothing wrong with issuing two or more
warnings under different options from the same code/pass,
especially when they're related or when both employ the same
analysis.  Some bugs are the results of violating multiple rules
and which one is chosen in each instance as the most representative
is a matter of

[Patch] Fortran: intrinsic.texi add missing arg to FINDLOC (was: FINDLOC documentation at https://gcc.gnu.org/onlinedocs/gfortran/FINDLOC.html)


Hi Kay, hi all,

On 09.02.21 21:41, Kay Diederichs wrote:

the above link does not document the optional MASK argument. Could somebody fix 
this, please?

Thanks for the comment.

Findloc's mask argument is documented in the text ("If @var{MASK} is 
present, only the elements for which @var{MASK} is @code{.TRUE.} are 
considered.") – but missing from the argument table. Thus, the attached 
patch should be sufficient, shouldn't it?


I intent commit it as obvious, unless there are comments.

Tobias

Fortran: intrinsic.texi add missing arg to FINDLOC

gcc/fortran/ChangeLog:

	* intrinsic.texi (FINDLOC): Add 'MASK' to argument table.

diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index 63416bce7fd..4a5172a835c 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi
@@ -6188,6 +6188,8 @@ conformance with @var{ARRAY}.
 @item @var{DIM} @tab (Optional) Shall be a scalar of type
 @code{INTEGER}, with a value between one and the rank of @var{ARRAY},
 inclusive.  It may not be an optional dummy argument.
+@item @var{MASK} @tab (Optional) Shall be an array of type @code{LOGICAL},
+and conformable with @var{ARRAY}.
 @item @var{KIND} @tab (Optional) An @code{INTEGER} initialization
 expression indicating the kind parameter of the result.
 @item @var{BACK} @tab (Optional) A scalar of type @code{LOGICAL}.

Re: [PATCH] c++: Consider addresses of heap artificial vars always non-NULL [PR98988]


On 2/9/21 3:11 PM, Jakub Jelinek wrote:

On Tue, Feb 09, 2021 at 03:00:20PM -0500, Jason Merrill wrote:

With -fno-delete-null-pointer-checks which is e.g. implied by
-fsanitize=undefined or default on some embedded targets, the middle-end
folder doesn't consider addresses of global VAR_DECLs to be non-NULL, as one
of them could have address 0.


Hmm, are these VAR_DECLs going into the symtab?  That seems undesirable.


They are not in the symtab from the C++ FE.  And we don't allow those VAR_DECLs 
or their
addresses to leak into the IL anyway.
But, they are TREE_STATIC because pretending they are automatic didn't
really work (e.g. address of automatic variables is not a constant expression).
The generic folding code uses maybe_nonzero_address predicate which
apprently creates a symtab entry for them but as nothing uses them they
won't be emitted.


Ah.


If we wanted to avoid that, we'd need to either add some generic VAR_DECL
bit for those vars or add a target hook that would return true for them and
then handle them in the middle-end.


I'm surprised that nonzero_address has such a limited set of things it 
will actually believe have non-zero addresses with 
-fno-delete-null-pointer-checks.  But it seems that we should be able to 
arrange to satisfy



  if (definition && !DECL_EXTERNAL (decl)


since these "variables" are indeed defined within the current 
translation unit.


Jason

[PATCH] Fix version namespace build

2021-02-09 Thread François Dumont via Gcc-patches


    libstdc++: Fix versioned namespace build

    libstdc++-v3/ChangeLog:

    * libsupc++/eh_ptr.cc (exception_ptr(__safe_bool)): Define if
    _GLIBCXX_EH_PTR_COMPAT is defined.
    (exception_ptr::_M_safe_bool_dummy()): Likewise.
    (exception_ptr::operator!()): Likewise.
    (exception_ptr::opeerator __safe_bool()): Likewise.

Ok to commit ?

François

diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 5d8ac1d879a..80999f52795 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -72,9 +72,12 @@ std::__exception_ptr::exception_ptr::exception_ptr(void* obj) noexcept
 : _M_exception_object(obj)  { _M_addref(); }
 
 
+#ifdef _GLIBCXX_EH_PTR_COMPAT
+
 std::__exception_ptr::exception_ptr::exception_ptr(__safe_bool) noexcept
 : _M_exception_object(nullptr) { }
 
+#endif
 
 void
 std::__exception_ptr::exception_ptr::_M_addref() noexcept
@@ -112,6 +115,7 @@ std::__exception_ptr::exception_ptr::_M_get() const noexcept
 { return _M_exception_object; }
 
 
+#ifdef _GLIBCXX_EH_PTR_COMPAT
 
 // Retained for compatibility with CXXABI_1.3.
 void
@@ -130,6 +134,8 @@ std::__exception_ptr::exception_ptr::operator __safe_bool() const noexcept
   return _M_exception_object ? _ptr::_M_safe_bool_dummy : nullptr;
 }
 
+#endif
+
 const std::type_info*
 std::__exception_ptr::exception_ptr::__cxa_exception_type() const noexcept
 {

[committed 2/2] analyzer: support "_IO_"-prefixed variants of FILE * fns [PR98575]

2021-02-09 Thread David Malcolm via Gcc-patches

PR analyzer/98575 describes an unexpected -Wanalyzer-malloc-leak false
positive from gcc.dg/analyzer/pr94851-1.c on glibc < 2.28.

The issue is that a getchar call gets inlined into a call to _IO_getc,
and "_IO_getc" is not in the set of FILE * functions the analyzer
"knows about".  This exposes a bug in memory leak detection on code
paths in which an unknown function has been called.

The memory leak bug is fixed in the prior commit, but for good
measure this patch special-cases the "_IO_"-prefixed names in glibc
so that the analyzer can reuse its knowledge about the unprefixed
variants.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7156-g790a8e8942b3f5a896ab5897cd209de1d9c382ae.

gcc/analyzer/ChangeLog:
PR analyzer/98575
* sm-file.cc (is_file_using_fn_p): Support "_IO_"-prefixed
variants.

gcc/testsuite/ChangeLog:
PR analyzer/98575
* gcc.dg/analyzer/file-1.c (test_5): New.
* gcc.dg/analyzer/file-3.c: New test.
---
 gcc/analyzer/sm-file.cc| 11 ++-
 gcc/testsuite/gcc.dg/analyzer/file-1.c |  7 +++
 gcc/testsuite/gcc.dg/analyzer/file-3.c | 18 ++
 3 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/file-3.c

diff --git a/gcc/analyzer/sm-file.cc b/gcc/analyzer/sm-file.cc
index aaa7ab28775..48ef4aa2334 100644
--- a/gcc/analyzer/sm-file.cc
+++ b/gcc/analyzer/sm-file.cc
@@ -307,7 +307,16 @@ static bool
 is_file_using_fn_p (tree fndecl)
 {
   function_set fs = get_file_using_fns ();
-  return fs.contains_decl_p (fndecl);
+  if (fs.contains_decl_p (fndecl))
+return true;
+
+  /* Also support variants of these names prefixed with "_IO_".  */
+  const char *name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
+  if (strncmp (name, "_IO_", 4) == 0)
+if (fs.contains_name_p (name + 4))
+  return true;
+
+  return false;
 }
 
 /* Implementation of state_machine::on_stmt vfunc for fileptr_state_machine.  
*/
diff --git a/gcc/testsuite/gcc.dg/analyzer/file-1.c 
b/gcc/testsuite/gcc.dg/analyzer/file-1.c
index f2b77b9db66..f9afa88f1e8 100644
--- a/gcc/testsuite/gcc.dg/analyzer/file-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/file-1.c
@@ -47,3 +47,10 @@ test_4 (const char *path)
 
   return; /* { dg-warning "leak of FILE 'f'" } */ 
 }
+
+void
+test_5 (const char *path)
+{
+  FILE *f = fopen (path, "r"); /* { dg-message "opened here" } */
+  return; /* { dg-warning "leak of FILE 'f'" } */ 
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/file-3.c 
b/gcc/testsuite/gcc.dg/analyzer/file-3.c
new file mode 100644
index 000..8f93a986deb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/file-3.c
@@ -0,0 +1,18 @@
+typedef struct _IO_FILE FILE;
+extern struct _IO_FILE *stderr;
+
+extern FILE *fopen (const char *__restrict __filename,
+   const char *__restrict __modes);
+extern int _IO_getc (FILE *stream);
+
+void
+test_1 (const char *path)
+{
+  FILE *f = fopen (path, "r"); /* { dg-message "opened here" } */
+
+  /* Implementation of getc in glibc < 2.28.
+ Verify that we know that this doesn't close the file.  */
+  _IO_getc (f);
+
+  return; /* { dg-warning "leak of FILE 'f'" } */ 
+}
-- 
2.26.2

[committed 1/2] analyzer: treat pointers written to *UNKNOWN as escaping [PR98575]

2021-02-09 Thread David Malcolm via Gcc-patches

PR analyzer/98575 describes an unexpected -Wanalyzer-malloc-leak false
positive from gcc.dg/analyzer/pr94851-1.c on glibc < 2.28.

The issue is that a getchar call gets inlined into a call to _IO_getc,
and "_IO_getc" is not in the set of FILE * functions the analyzer
"knows about".  This leads to a global pointer
  struct buf *curbp;
being treated as UNKNOWN after the call to _IO_getc.  Later when a
malloced pointer is written to curbp->b_amark, the write is discarded
(since curbp is unknown) without noting that the pointer has escaped,
and so the pointer is erroneously treated as leaking when the function
returns.

This patch updates the handling of *UNKNOWN to treat pointers written
to them as having escaped, fixing the false positive.

The patch stops the leak warning in gcc.dg/analyzer/explode-1.c.
After merging states at the join-point after the first switch, pp has
UNKNOWN value, and so *pp is a write through UNKNOWN, which with this
patch is now treated as escaping - despite the fact that all possible
values for *pp are on the stack.  There doesn't seem to be a good way
to fix this, and the testcase is an artifically constructed one, so the
patch simply removes the dg-warning directive.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Verified the fix on gcc135.fsffrance.org.
Pushed to trunk as r11-7155-g1d9f3b7ad4f965a0acc21d42cb2d186ecd065b71.

gcc/analyzer/ChangeLog:
PR analyzer/98575
* store.cc (store::set_value): Treat a pointer written to *UNKNOWN
as having escaped.

gcc/testsuite/ChangeLog:
PR analyzer/98575
* gcc.dg/analyzer/explode-1.c: Remove expected leak warning.
* gcc.dg/analyzer/pr94851-2.c: New test.
* gcc.dg/analyzer/pr98575-1.c: New test.
---
 gcc/analyzer/store.cc | 17 +--
 gcc/testsuite/gcc.dg/analyzer/explode-1.c |  2 +-
 gcc/testsuite/gcc.dg/analyzer/pr94851-2.c | 54 +++
 gcc/testsuite/gcc.dg/analyzer/pr98575-1.c | 46 +++
 4 files changed, 115 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94851-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98575-1.c

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index abdb336da91..da5b5adb5b4 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1820,9 +1820,20 @@ store::set_value (store_manager *mgr, const region 
*lhs_reg,
   const region *lhs_base_reg = lhs_reg->get_base_region ();
   binding_cluster *lhs_cluster;
   if (lhs_base_reg->symbolic_for_unknown_ptr_p ())
-/* Reject attempting to bind values into a symbolic region
-   for an unknown ptr; merely invalidate values below.  */
-lhs_cluster = NULL;
+{
+  /* Reject attempting to bind values into a symbolic region
+for an unknown ptr; merely invalidate values below.  */
+  lhs_cluster = NULL;
+
+  /* The LHS of the write is *UNKNOWN.  If the RHS is a pointer,
+then treat the region being pointed to as having escaped.  */
+  if (const region_svalue *ptr_sval = rhs_sval->dyn_cast_region_svalue ())
+   {
+ const region *ptr_dst = ptr_sval->get_pointee ();
+ const region *ptr_base_reg = ptr_dst->get_base_region ();
+ mark_as_escaped (ptr_base_reg);
+   }
+}
   else
 {
   lhs_cluster = get_or_create_cluster (lhs_base_reg);
diff --git a/gcc/testsuite/gcc.dg/analyzer/explode-1.c 
b/gcc/testsuite/gcc.dg/analyzer/explode-1.c
index 9b95afd9a03..6b62e8e871c 100644
--- a/gcc/testsuite/gcc.dg/analyzer/explode-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/explode-1.c
@@ -47,7 +47,7 @@ void test (void)
{
default:
case 0:
- *pp = malloc (16); /* { dg-warning "leak" } */
+ *pp = malloc (16);
  break;
case 1:
  free (*pp);
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94851-2.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94851-2.c
new file mode 100644
index 000..60947216b7f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94851-2.c
@@ -0,0 +1,54 @@
+/* As pr94851-1.c, but verify that we don't get confused by a call to
+   an unknown function (PR analyzer/98575).  */
+
+/* { dg-additional-options "-O2" } */
+
+#include 
+#include 
+
+typedef struct AMARK {
+  struct AMARK *m_next;
+  char m_name;
+} AMARK;
+
+struct buf {
+  AMARK *b_amark;
+};
+
+struct buf *curbp;
+
+extern void unknown_fn (void);
+
+int pamark(void) {
+  int c;
+
+  AMARK *p = curbp->b_amark;
+  AMARK *last = curbp->b_amark;
+
+  unknown_fn ();
+
+  c = getchar ();
+
+  while (p != (AMARK *)NULL && p->m_name != (char)c) {
+last = p;
+p = p->m_next;
+  }
+
+  if (p != (AMARK *)NULL) {
+printf("over writing mark %c\n", c);
+  } else {
+if ((p = (AMARK *)malloc(sizeof(AMARK))) == (AMARK *)NULL)
+  return 0;
+
+p->m_next = (AMARK *)NULL;
+
+if (curbp->b_amark == (AMARK *)NULL)
+  curbp->b_amark = p;
+else
+  last->m_next = p;
+  }
+
+  p->m_name =

Re: [PATCH] c++: Consider addresses of heap artificial vars always non-NULL [PR98988]

On Tue, Feb 09, 2021 at 03:00:20PM -0500, Jason Merrill wrote:
> > With -fno-delete-null-pointer-checks which is e.g. implied by
> > -fsanitize=undefined or default on some embedded targets, the middle-end
> > folder doesn't consider addresses of global VAR_DECLs to be non-NULL, as one
> > of them could have address 0.
> 
> Hmm, are these VAR_DECLs going into the symtab?  That seems undesirable.

They are not in the symtab from the C++ FE.  And we don't allow those VAR_DECLs 
or their
addresses to leak into the IL anyway.
But, they are TREE_STATIC because pretending they are automatic didn't
really work (e.g. address of automatic variables is not a constant expression).
The generic folding code uses maybe_nonzero_address predicate which
apprently creates a symtab entry for them but as nothing uses them they
won't be emitted.
If we wanted to avoid that, we'd need to either add some generic VAR_DECL
bit for those vars or add a target hook that would return true for them and
then handle them in the middle-end.

Jakub

Re: [PATCH] c++: Private parent access check for using decls [PR19377]

On 2/9/21 6:18 AM, Anthony Sharp wrote:

The parens are to help the editor line up the last line properly.  If
the subexpression fits on one line, the parens aren't needed.

A okay.

No; "compile" means translate from C++ to assembly, "assemble" means
that, plus call the assembler to produce an object file.  Though since
compilation errors out, assembling never happens.

lol I was being dumb and thinking it was the other way around. I will 
change it.

   You could bring back your lookup from the earlier patch and look
through the result to see if the function we're complaining about is
part of what the particular using-decl brings in.

I will give that a try.

I think you want
SAME_BINFO_TYPE_P (context_for_name_lookup (decl),
                     BINFO_TYPE (parent_binfo))

Am I reading this wrong then? :

/* Compare a BINFO_TYPE with another type for equality.
For a binfo, this is functionally equivalent to using same_type_p, but 
measurably faster.
  At least one of the arguments must be a BINFO_TYPE.  The other can be 
a BINFO_TYPE
or a regular type.  If BINFO_TYPE(T) ever stops being the main variant 
of the class the

binfo is for, this macro must change.  */
#define SAME_BINFO_TYPE_P(A, B) ((A) == (B))

That leads me to believe that arguments A and B can be: BINFO, BINFO ... 
or BINFO, TYPE ... or TYPE, BINFO. If so I don't see why this doesn't work:

(SAME_BINFO_TYPE_P (context_for_name_lookup (decl), parent_binfo))
and
(SAME_BINFO_TYPE_P (TYPE_BINFO (context_for_name_lookup (decl)), 
parent_binfo))

Unless "BINFO_TYPE" is actually just how you refer to a regular class 
type and not a BINFO. Seems a bit confusing to me.

The comment is pretty conusing, I agree.  But what it's saying is that 
both arguments must be types, and one of those types must be BINFO_TYPE 
(some_binfo).

Your first line doesn't work because you're comparing a type and a 
binfo.  The second one doesn't work because you're comparing the binfo 
for a most-derived object of the type to the binfo for a base subobject 
of the same type, and those are different, because binfos represent 
nodes in inheritance graphs.

This line needs one more space.

Oh I see ... so second line's arguments need to line up with the first 
argument, not the first (.

I will send over a new patch later/tomorrow.

Anthony

On Tue, 9 Feb 2021 at 04:55, Jason Merrill > wrote:

On 2/5/21 12:28 PM, Anthony Sharp wrote:
 > Hi Marek,
 >
 >>> +      if ((TREE_CODE (parent_field) == USING_DECL)
 >>
 >> This first term doesn't need to be wrapped in ().
 >
 > I normally wouldn't use the (), but I think that's how Jason likes it
 > so that's why I did it. I guess it makes it more readable.

Ah, no, I don't see any need for the extra () there.  When I asked for
added parens previously:

 >> +       if (parent_binfo != NULL_TREE
 >> +           && context_for_name_lookup (decl)
 >> +           != BINFO_TYPE (parent_binfo))
 >
 > Here you want parens around the second != expression and its != token
 > aligned with "context"

The parens are to help the editor line up the last line properly.  If
the subexpression fits on one line, the parens aren't needed.

 >> We usually use dg-do compile.
 >
 > True, but wouldn't that technically be slower? I would agree if it
 > were more than just error-handling code.

No; "compile" means translate from C++ to assembly, "assemble" means
that, plus call the assembler to produce an object file.  Though since
compilation errors out, assembling never happens.

 > +      if ((TREE_CODE (parent_field) == USING_DECL)
 > +        && TREE_PRIVATE (parent_field)
 > +        && (DECL_NAME (decl) == DECL_NAME (parent_field)))
 > +       return parent_field;

Following our discussion about an earlier revision, this will often
produce the right using-declaration, but might not if two functions of
the same name are imported from different bases.  If I split the
using-declaration in using9.C into two, i.e.

 >   using A2::i; // { dg-message "declared" }
 >   using A::i;

then I get

 > wa.ii: In member function ‘void C::f()’:
 > wa.ii:28:7: error: ‘int A::i()’ is private within this context
 >    28 |     i();
 >       |       ^
 > wa.ii:20:13: note: declared private here
 >    20 |   using A2::i;

pointing out the wrong using-declaration because it happens to be
first.
   You could bring back your lookup from the earlier patch and look
through the result to see if the function we're complaining about is
part of what the particular using-decl brings in.

 > I tried both:
 > (SAME_BINFO_TYPE_P (context_for_name_lookup (decl), parent_binfo))
 > and
 > (SAME_BINFO_TYPE_P (TYPE_BINFO (context_for_name_lookup (decl)),
parent_binfo))

I think you

Re: [PATCH] c++: Consider addresses of heap artificial vars always non-NULL [PR98988]


On 2/9/21 3:25 AM, Jakub Jelinek wrote:

Hi!

With -fno-delete-null-pointer-checks which is e.g. implied by
-fsanitize=undefined or default on some embedded targets, the middle-end
folder doesn't consider addresses of global VAR_DECLs to be non-NULL, as one
of them could have address 0.


Hmm, are these VAR_DECLs going into the symtab?  That seems undesirable.


Still, I think malloc/operator new (at least
the nonthrowing) relies on NULL returns meaning allocation failure rather
than success.  Furthermore, the artificial VAR_DECLs we create for
constexpr new never actually live in the address space of the program,
so we can pretend they will never be NULL too.



The following patch does that, so that one can actually use constexpr
new/delete even with -fno-delete-null-pointer-checks or sanitizers
- otherwise delete itself tests whether the passed address is non-NULL and
that wouldn't fold to true if the pointer has been allocated with constexpr
new.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-09  Jakub Jelinek  

PR c++/98988
* constexpr.c (cxx_eval_binary_expression): Fold equality comparison
of address of heap artificial VAR_DECL against nullptr.

* g++.dg/cpp2a/constexpr-new16.C: New test.

--- gcc/cp/constexpr.c.jj   2021-02-02 09:52:10.535234056 +0100
+++ gcc/cp/constexpr.c  2021-02-08 17:24:28.774467744 +0100
@@ -3157,6 +3157,32 @@ cxx_eval_binary_expression (const conste
lhs = cplus_expand_constant (lhs);
else if (TREE_CODE (rhs) == PTRMEM_CST)
rhs = cplus_expand_constant (rhs);
+  if (r == NULL_TREE && POINTER_TYPE_P (TREE_TYPE (lhs)))
+   {
+ tree op0 = lhs;
+ tree op1 = rhs;
+ if (integer_zerop (op0))
+   std::swap (op0, op1);
+ /* With -fno-delete-null-pointer-checks, generic folders
+assume addresses of non-automatic VAR_DECLs could be NULL.
+For constexpr new, as the vars aren't really living in
+target memory, assume they will never be NULL.  */
+ if (integer_zerop (op1))
+   {
+ STRIP_NOPS (op0);
+ if (TREE_CODE (op0) == ADDR_EXPR)
+   {
+ op0 = get_base_address (TREE_OPERAND (op0, 0));
+ if (op0
+ && VAR_P (op0)
+ && (DECL_NAME (op0) == heap_identifier
+ || DECL_NAME (op0) == heap_uninit_identifier
+ || DECL_NAME (op0) == heap_vec_identifier
+ || DECL_NAME (op0) == heap_vec_uninit_identifier))
+   r = constant_boolean_node (!is_code_eq, type);
+   }
+   }
+   }
  }
if (code == POINTER_PLUS_EXPR && !*non_constant_p
&& integer_zerop (lhs) && !integer_zerop (rhs))
--- gcc/testsuite/g++.dg/cpp2a/constexpr-new16.C.jj 2021-02-08 
17:25:37.130699213 +0100
+++ gcc/testsuite/g++.dg/cpp2a/constexpr-new16.C2021-02-08 
17:26:49.175889198 +0100
@@ -0,0 +1,13 @@
+// PR c++/98988
+// { dg-do compile { target c++20 } }
+// { dg-options "-fno-delete-null-pointer-checks" }
+
+constexpr bool
+foo ()
+{
+  auto ptr = new int();
+  delete ptr;
+  return true;
+}
+
+static_assert (foo ());

Jakub

Re: [PATCH 2/5] Make sure that static data member constexpr isn't optimized away in test.

On Tue, Feb 09, 2021 at 02:40:12PM -0500, Jason Merrill wrote:
> For GCC 11, I think let's fix the regression with your (Jakub) earlier
> patch, maybe only for DIEs with DW_AT_const_value.

Thanks.
Following works too, so I'll test it tonight.

2021-02-09  Jakub Jelinek  

PR debug/98755
* dwarf2out.c (prune_unused_types_walk): Mark DW_TAG_variable DIEs
at class scope for DWARF5+.

--- gcc/dwarf2out.c.jj  2021-01-27 10:05:35.313942040 +0100
+++ gcc/dwarf2out.c 2021-02-09 20:51:42.907922550 +0100
@@ -29475,6 +29475,16 @@ prune_unused_types_walk (dw_die_ref die)
  if (die->die_perennial_p)
break;
 
+ /* For static data members, the declaration in the class is supposed
+to have DW_TAG_member tag in DWARF{3,4} but DW_TAG_variable in
+DWARF5.  DW_TAG_member will be marked, so mark even such
+DW_TAG_variables in DWARF5, as long as it has DW_AT_const_value
+attribute.  */
+ if (dwarf_version >= 5
+ && class_scope_p (die->die_parent)
+ && get_AT (die, DW_AT_const_value))
+   break;
+
  /* premark_used_variables marks external variables --- don't mark
 them here.  But function-local externals are always considered
 used.  */


Jakub

Re: [PATCH 2/5] Make sure that static data member constexpr isn't optimized away in test.


On 9/1/20 2:46 PM, Jason Merrill wrote:

On 8/25/20 5:19 AM, Mark Wielaard wrote:

On Mon, 2020-08-24 at 17:38 -0400, Jason Merrill wrote:

This looks incorrect to me, that is a workaround for a real GCC bug.

Shouldn't we instead do something like (untested) following patch?
I mean, for DWARF < 5 the static data members were using DW_TAG_member,
which has been always marked by the function, so IMHO we should also 
always

mark the DW_TAG_variables at the class scope that replaced those.


The earlier behavior seems like an accident, that happened because we
always need to emit information about non-static data members.  I don't
think we should take it as guidance.


Maybe the reason they got emitted this way was an accident on the GCC
side. But I don't think it is an accident on the GDB side. At least
looking at the various gdb testcases, the intention is to show a
struct/class type as defined to the user, which includes both the
static and non-static data members of a class.


That would make sense.


So, GDB prefers no pruning of members...


In this case one reason we don't emit debug info is because (before
C++17) there's no definition of 'b'.  After C++17 the in-class
declaration of 'b' is a definition, but we don't have to give it a
symbol, so there's still nothing for the debug info to describe.


But don't we describe other parts of a type that don't have a symbol as
part of the debug info?


It seems that currently we describe unused/undefined functions, but not 
unused nested types/typedefs.



On 8/25/20 8:41 AM, Jakub Jelinek wrote:


On Mon, Aug 24, 2020 at 05:38:28PM -0400, Jason Merrill via 
Gcc-patches wrote:


This issue doesn't seem specific to class members; it also affects
namespace-scope C++17 inline variables:

inline const int var = 42;
int main() { return var; }

Compiling this testcase with -g doesn't emit any debug info for 'var' 
even

though it's used.

Should we assume that if a variable with DW_AT_const_value is 
TREE_USED, we

need to write out debug info for it?


I guess it is a question of how exactly the (default on)
-feliminate-unused-debug-symbols should behave with different kind of
entities.

One thing are the non-inline static data members without const/constexpr or
without initializer in the class.  Those need a definition if they are ever
used, so it is probably fine to only emit them in the class in the TU where
they are defined.  But note that e.g. with -fdebug-types-section we still
force them to be output in class and do no pruning (and the pruning actually
makes dwz less efficient unless dwz is tought to not only merge the DIEs
with the same children and attributes, but also reconstruct more complete
DIEs out of several less complete ones for the same class).


Right, this gets at Mark's point above.  How much pruning do we want to 
do of class bodies?  We currently do some, but how much benefit does 
that actually give us?  Is it worth the cost?


...and our deduplication mechanisms prefer no pruning of members.


Another case is non-inline static const data member with initializer,
do we track non-odr uses e.g. during constant evaluation and if so, should
that result in the static data member appearing?  Because if the static
const data member with initializer is never odr used, it doesn't have to be
ever defined and so it might never appear in the debug info.

Another case are inline vars, shall we treat as being used just that they
were used in some constant expression, or do only odr uses count?


If the goal of debug info is to be able to evaluate the same expressions 
that appear in the source, constant uses need to count, too.  I wonder 
how we could associate the uses with their context so pruning works 
properly.


For GCC 11, I think let's fix the regression with your (Jakub) earlier 
patch, maybe only for DIEs with DW_AT_const_value.


For GCC 12, maybe we want to stop pruning any class members by default.

Jason

Fix PR rtl-optimization/96015

2021-02-09 Thread Eric Botcazou

This is the miscompilation of Python on HP-PA/Linux present on the mainline 
and 10 branch, caused by the presence of a call to __builtin_unreachable () in 
the middle of a heavily branchy code, which confuses the reorg pass.

Bootstrapped/regtested on SPARC/Solaris, applied on mainline and 10 branch.


2021-02-09  Eric Botcazou 

PR rtl-optimization/96015
* reorg.c (skip_consecutive_labels): Minor comment tweaks.
(relax_delay_slots): When deleting a jump to the next active
instruction over a barrier, first delete the barrier if the
jump is the only way to reach the target label.

-- 
Eric Botcazoudiff --git a/gcc/reorg.c b/gcc/reorg.c
index 84beb9395aa..d468fa698ba 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -139,9 +139,9 @@ skip_consecutive_labels (rtx label_or_return)
   /* __builtin_unreachable can create a CODE_LABEL followed by a BARRIER.
 
  Since reaching the CODE_LABEL is undefined behavior, we can return
- any code label and we're OK at runtime.
+ any code label and we're OK at run time.
 
- However, if we return a CODE_LABEL which leads to a shrinked wrapped
+ However, if we return a CODE_LABEL which leads to a shrink-wrapped
  epilogue, but the path does not have a prologue, then we will trip
  a sanity check in the dwarf2 cfi code which wants to verify that
  the CFIs are all the same on the traces leading to the epilogue.
@@ -3175,6 +3175,23 @@ relax_delay_slots (rtx_insn *first)
 	  && ! condjump_in_parallel_p (jump_insn)
 	  && ! (next && switch_text_sections_between_p (jump_insn, next)))
 	{
+	  rtx_insn *direct_label = as_a (JUMP_LABEL (insn));
+	  rtx_insn *prev = prev_nonnote_insn (direct_label);
+
+	  /* If the insn jumps over a BARRIER and is the only way to reach
+		 its target, then we need to delete the BARRIER before the jump
+		 because, otherwise, the target may end up being considered as
+		 unreachable and thus also deleted.  */
+	  if (BARRIER_P (prev) && LABEL_NUSES (direct_label) == 1)
+		{
+		  delete_related_insns (prev);
+
+		  /* We have just removed a BARRIER, which means that the block
+		 number of the next insns has effectively been changed (see
+		 find_basic_block in resource.c), so clear it.  */
+		  clear_hashed_info_until_next_barrier (direct_label);
+		}
+
 	  delete_jump (jump_insn);
 	  continue;
 	}

[OG10] [committed] Backport patches for non-rectangular loop collapse

2021-02-09 Thread Kwok Cheung Yeung


Hello

I have backported the following patches for supporting non-rectangular loop 
collapse from mainline to the devel/omp/gcc-10 branch:


7bfdb5a1c694cb9006e0478941e4443b230f5b98 openmp: Fix ICE on non-rectangular loop 
with known 0 iterations
88528328ea560230f728af97110e89396c8267d2 openmp: Improve composite triangular 
loop lowering and expansion
758fdf6514348a40ed424f3244cb25b92a005095 openmp: Add support for non-rectangular 
loops in taskloop construct
4b2c33ef32f483ca78f6c7b3a5ee880aebe75b4c openmp: Handle even some combined 
non-rectangular loops
3dd767906dab7d5456fc4c3a98134582b5f8a2ed openmp: Use more efficient logical -> 
actual computation even if # iterations is computed at runtime
1e507ef879b2bf4ec1c80a07f41c52474ed32ba3 openmp: Compute number of collapsed 
loop iterations more efficiently for some non-rectangular loops

c5f31b373a075254e954e6d690671f68955db6d4 openmp: Fix up loop-21.c
6da60f76333666e98955aa33624a7e95bffe58aa openmp: Adjust outer bounds of non-rect 
loops
8abe8a169150b717c8e1f7de8c1e0d29b9381806 openmp: Optimize triangular loop 
logical iterator to actual iterators computation using search for quadratic 
equation root(s)
551b4fbc89e84c43a9cd202bc537f428b39aab83 openmp: Diagnose non-rectangular loops 
with invalid steps
b2eabb179a3e2eab5eda2cc0829ba88756252189 openmp: Non-rectangular loop support 
for non-composite worksharing loops and distribute
076673fd7c64940b761ce980ea54df7ef6dd2199 openmp: Fix two pastos in non-rect loop 
OpenMP lowering.
875154e999bdf000f95e811cc50a1cbf76c0ce71 openmp: Compute triangular loop number 
of iterations at compile time
3cdeb3993f3901407dcb4320525876c188f31872 openmp: Initial part of OpenMP 5.0 
non-rectangular loop support


Kwok

[PATCH] c++: abbreviated function template return type rewriting [PR98990]

2021-02-09 Thread Patrick Palka via Gcc-patches

When an abbreviated function template has a complex placeholder return
type such auto& or auto**, the level adjustment performed by
splice_late_return_type directly replaces the 'auto' inside the original
return type with the level-adjusted 'auto', but that breaks
TYPE_CANONICAL caching.  Instead, we should rebuild the entire return
type using the adjusted 'auto'.

This patch makes this happen by tsubsting the original return type with
an argument vector that maps the original 'auto' to the adjusted 'auto'.
In passing, this patch also reverts the misguided changes to
find_type_usage in r10-6571 that made find_type_usage return a tree*
instead of a tree so as to discourage this kind of in-place type
modification.

It occurred to me that the constraint also needs to be rebuilt so that
it refers to the adjusted 'auto', but this oversight doesn't seem to
cause any issues at the moment due to how do_auto_deduction "manually"
substitutes the 'auto' inside the constraint before performing
satisfaction.  So this will instead be fixed as part of a broader rework
of placeholder type constraint checking.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/10?

gcc/cp/ChangeLog:

PR c++/98990
* pt.c (splice_late_return_type): Rebuild the entire return type
if we have to adjust the level of the auto.  Use
TEMPLATE_TYPE_LEVEL for brevity.
(type_uses_auto): Adjust call to find_type_usage.
* type-utils.h (find_type_usage): Revert r10-6571 change that
made this function return a pointer to the auto node.

gcc/testsuite/ChangeLog:

PR c++/98990
* g++.dg/concepts/abbrev8.C: New test.
---
 gcc/cp/pt.c | 39 +
 gcc/cp/type-utils.h | 23 +++
 gcc/testsuite/g++.dg/concepts/abbrev8.C | 22 ++
 3 files changed, 53 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/concepts/abbrev8.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c5b0a9292db..46cd322fbf4 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29601,22 +29601,25 @@ splice_late_return_type (tree type, tree 
late_return_type)
   return late_return_type;
 }
 
-  if (tree *auto_node = find_type_usage (, is_auto))
-{
-  tree idx = get_template_parm_index (*auto_node);
-  if (TEMPLATE_PARM_LEVEL (idx) <= processing_template_decl)
-   {
- /* In an abbreviated function template we didn't know we were dealing
-with a function template when we saw the auto return type, so 
update
-it to have the correct level.  */
- tree new_auto = make_auto_1 (TYPE_IDENTIFIER (*auto_node), false);
- PLACEHOLDER_TYPE_CONSTRAINTS (new_auto)
-   = PLACEHOLDER_TYPE_CONSTRAINTS (*auto_node);
- TYPE_CANONICAL (new_auto) = canonical_type_parameter (new_auto);
- new_auto = cp_build_qualified_type (new_auto, TYPE_QUALS 
(*auto_node));
- *auto_node = new_auto;
-   }
-}
+  if (tree auto_node = find_type_usage (type, is_auto))
+if (TEMPLATE_TYPE_LEVEL (auto_node) <= processing_template_decl)
+  {
+   /* In an abbreviated function template we didn't know we were dealing
+  with a function template when we saw the auto return type, so rebuild
+  the return type using an auto with the correct level.  */
+   tree new_auto = make_auto_1 (TYPE_IDENTIFIER (auto_node), false);
+   tree auto_vec = make_tree_vec (1);
+   TREE_VEC_ELT (auto_vec, 0) = new_auto;
+   tree targs = add_outermost_template_args (current_template_args (),
+ auto_vec);
+   /* FIXME: We should also rebuild the constraint to refer to the new
+  auto.  */
+   PLACEHOLDER_TYPE_CONSTRAINTS (new_auto)
+ = PLACEHOLDER_TYPE_CONSTRAINTS (auto_node);
+   TYPE_CANONICAL (new_auto) = canonical_type_parameter (new_auto);
+   new_auto = cp_build_qualified_type (new_auto, TYPE_QUALS (auto_node));
+   return tsubst (type, targs, tf_none, NULL_TREE);
+  }
   return type;
 }
 
@@ -29661,10 +29664,8 @@ type_uses_auto (tree type)
   else
return NULL_TREE;
 }
-  else if (tree *tp = find_type_usage (, is_auto))
-return *tp;
   else
-return NULL_TREE;
+return find_type_usage (type, is_auto);
 }
 
 /* Report ill-formed occurrences of auto types in ARGUMENTS.  If
diff --git a/gcc/cp/type-utils.h b/gcc/cp/type-utils.h
index 5551e8f5ef1..138fed6c51e 100644
--- a/gcc/cp/type-utils.h
+++ b/gcc/cp/type-utils.h
@@ -20,22 +20,21 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_CP_TYPE_UTILS_H
 #define GCC_CP_TYPE_UTILS_H
 
-/* Returns a pointer to the first tree within *TP that is directly matched by
-   PRED.  *TP may be a type or PARM_DECL and is incrementally decomposed toward
-   its type-specifier until a match is found.  NULL is returned if PRED does 
not
-   match any part of

Re: [PATCH] avoid -Warray-bounds checks for vtable assignments (PR 98266)

2021-02-09 Thread Richard Earnshaw (lists) via Gcc-patches


On 2/8/21 4:11 PM, Jeff Law wrote:



On 2/8/21 3:44 PM, Martin Sebor wrote:

On 2/8/21 3:26 PM, Jeff Law wrote:



On 2/8/21 2:56 PM, Martin Sebor wrote:

On 2/8/21 12:59 PM, Jeff Law wrote:



On 1/19/21 5:56 PM, Martin Sebor via Gcc-patches wrote:

Similar to the problem reported for -Wstringop-overflow in pr98266
and already fixed, -Warray-bounds is also susceptible to false
positives in assignments and copies involving virtual inheritance.
Because the two warnings don't share code yet (hopefully in GCC 12)
the attached patch adds its own workaround for this problem to
gimple-array-bounds.cc, this one slightly more crude because of
the limited insight the array bounds checking has into the checked
expressions.

Tested on x86_64-linux.

Martin

gcc-98266.diff

PR middle-end/98266 - bogus array subscript is partly outside array
bounds on virtual inheritance

gcc/ChangeLog:

  PR middle-end/98266
  * gimple-array-bounds.cc
(array_bounds_checker::check_array_bounds):
  Avoid checking references involving artificial members.

gcc/testsuite/ChangeLog:

  PR middle-end/98266
  * g++.dg/warn/Warray-bounds-15.C: New test.

It seems to me that we've got the full statement at some point  and
thus
the full expression so at some point couldn't we detect when
TYPE_SIZE_UNIT!= DECL_SIZE_UNIT?  Or should we be using TYPE_SIZE_UNIT
rather than DECL_SIZE_UNIT in gimple-array-bounds.cc

Am I missing something?


The expression we're looking at when the false positive is issued
is the MEM_REF in the LHS of:

MEM[(struct D *) + 24B]._vptr.D =   [(void
*)&_ZTC1E24_1D + 24B];

TREE_TYPE(LHS) is D, DECL_SIZE_UNIT (D.2652) is 24, and
TYPE_SIZE_UNIT(D) is also 24, so there's no discrepancy between
DECL_SIZE and TYPE_SIZE.

So that seems like it's a different issue then, unrelated to 97595.
Right?


I think the underlying problem is the same.  We're getting a size
that doesn't correspond to what's actually being accessed, and it
happens because of the virtual inheritance.  In pr97595 Jason
suggested to use the decl/type size inequality to identify this
case but I think we could have just as well used DECL_ARTIFICIAL
instead.  At least the test cases from pr97595 both pass with
this change.

But in the 98266 case the type and decl sizes are the same.  So to be
true that would mean that the underlying type we're using to access
memory differs from its actual type.  Is that the case in the IL?  And
does this have wider implications for diagnostics or optimizations that
rely on accurate type sizing?

I'm just trying to make sure I understand, not accepting or rejecting
the patch yet.


The part of the IL with the MEM_REF is this:

void g ()
{
  void * D.2789;
  struct E D.2652;

   [local count: 1073741824]:
  E::E (, "");
  f ();

   [local count: 1073741824]:
  MEM[(struct D *) + 24B]._vptr.D =   [(void 
*)&_ZTC1E24_1D + 24B];

  ...

The access here is to the _vptr.D pointer member of D.2652 which is
just past the end of the parent object (as reflected by its SIZE):
it sets sets up the virtual table pointer.

The access in pr97595 is to the member subobject, which, as Jason
explained (and I accordingly documented under DECL_SIZE in tree.h),
is also laid out separately from the parent object.

These cases aren't exactly the same (which is also why the test
I added for -Warray-bounds in pr97595 didn't expose this bug) but
they are closely related.  The one here can be distinguished by
DECL_ARTIFICAL.  The other by the DECL_SIZE != TYPE_SIZE member
inequality.

Might this impact other warnings?  I'd say so if they don't take
these things into account.  I just learned about this in pr97595
which was a -Wstringop-overflow false positive but I also saw
a similar instance of -Warray-bounds with my patch to improve
caching and enhance array bounds checking.  I dealt with that
instance of the warning in that patch but proactively added
a test case to the fix for pr97595.  But the test case is focused
on the subobject access and not on one to the virtual table so
(as I said above) it didn't expose this bug.

Might this also impact optimizations?  I can imagine someone
unaware of this "gotcha" making the same "naive" assumption
I did, but I'd also expect such an invalid assumption to be
found either in code review or quickly cause problems in
testing.

Martin

Re: [PATCH v2] arm: Low overhead loop handle long range branches [PR98931]

On 09/02/2021 16:27, Andrea Corallo via Gcc-patches wrote:
> Jakub Jelinek  writes:
> 
>> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches 
>> wrote:
"TARGET_32BIT && TARGET_HAVE_LOB"
 -  "le\t%|lr, %l0")
 +  "*
 +  if (get_attr_length (insn) == 4)
 +return \"le\\t%|lr, %l0\";
 +  else
 +return \"subs\\t%|lr, #1\;bne\\t%l0\";
 +  "
>>>
>>> Why not
>>> {
>>>   if (get_attr_length (insn) == 4)
>>> return "le\t%|lr, %l0";
>>>   else
>>> return "subs\t%|lr, #1;bne\t%l0";
>>> }
>>> instead?  Seems the arm backend uses "*..." more than the more modern {},
>>> but one needs to backslash prefix a lot which makes it less readable?
>>
>> Where "more modern" is introduced 19.5 years ago ;)
>>
>>  Jakub
> 
> I guess we really like traditions :)
> 
> Attached second version addressing this.
> 
> Thanks
> 
>   Andrea
> 

You're missing a clobber of the condition codes in the RTL.  This wasn't
needed before, but is now.

R.

Re: [PATCH] ICF: fix memory leak

On Tue, Feb 9, 2021 at 6:17 PM Martin Liška  wrote:
>
> The following fixes a memory leak.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> PR ipa/99003
> * ipa-icf.c (sem_item::add_reference): Fix memory leak when
> a reference exists.
> ---
>   gcc/ipa-icf.c | 7 +--
>   1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 435f567d72e..687ad8d45b7 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -165,8 +165,11 @@ sem_item::add_reference (ref_map *refs,
> unsigned index = reference_count++;
> bool existed;
>
> -  vec 
> -= refs->get_or_insert (new sem_usage_pair (target, index), );
> +  sem_usage_pair *pair = new sem_usage_pair (target, index);
> +  vec  = refs->get_or_insert (pair, );
> +  if (existed)
> +delete pair;
> +
> v.safe_push (this);
> bitmap_set_bit (target->usage_index_bitmap, index);
> refs_set.add (target->node);
> --
> 2.30.0
>

Re: [PATCH] if-to-switch: fix a memory leak

On Tue, Feb 9, 2021 at 6:16 PM Martin Liška  wrote:
>
> The following fixes a memory leak.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> PR tree-optimization/99002
> * gimple-if-to-switch.cc (find_conditions): Fix memory leak
> in the function.
> ---
>   gcc/gimple-if-to-switch.cc | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
> index 1712fc4c8b3..f39662be3e6 100644
> --- a/gcc/gimple-if-to-switch.cc
> +++ b/gcc/gimple-if-to-switch.cc
> @@ -447,10 +447,9 @@ find_conditions (basic_block bb,
> info->record_phi_mapping (info->m_false_edge,
> >m_false_edge_phi_mapping);
> conditions_in_bbs->put (bb, info);
> +  return;
>   }
>
> -  return;
> -
>   exit:
> delete info;
>   }
> --
> 2.30.0
>

Re: PR 96391? Can we fix it for gcc11?

On Tue, 9 Feb 2021, Qing Zhao wrote:

> Richard,
> 
> Thank you for the reply.
> 
> Yes, I understand that without a working testing case to repeat the error, 
> it’s very hard to debug and fix the issue. 
> 
> However, providing a testing case for this bug is really challenging from our 
> side due to multiple reasons…
> 
> I will discuss with our building engineer to see what we can do. Or, I will 
> try to debug and fix this issue myself…

Note you can try reducing a proprietary testcase with tools like
cvise or creduce.  Does your case also happen in a mingw/windows
environment?

Richard.

> 
> Qing
> 
> > On Feb 9, 2021, at 2:18 AM, Richard Biener  wrote:
> > 
> > On Mon, 8 Feb 2021, Qing Zhao wrote:
> > 
> >> Hi, 
> >> 
> >> The bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96391 
> >>  
> >>  >> >
> >> 
> >> Bug 96391 - [10/11 Regression] internal compiler error: in 
> >> linemap_compare_locations, at libcpp/line-map.c:1359 
> >> 
> >> has been opened on 7/30/2020, and multiple users reported the same issue. 
> >> 
> >> For our important application, all the C++ modules failed with this bug 
> >> when we use gcc10 or gcc11. Then we have 
> >> To use icc to compile C++, and gcc to compile C, it’s very inconvenient. 
> >> 
> >> I have raised the priority of this bug to P2 on 10/09/2020,  hope it can 
> >> be fixed in gcc11. 
> >> 
> >> I see that Michael Cronenworth has attached a preprocessed file for the 
> >> reproducing purpose in comment 4. 
> >> 
> >> So, can we have the fix of the bug in gcc11?
> > 
> > The issue is that the preprocessed source does not reproduce the issue and
> > a mingw development environment is not easily accessible (to me at least).
> > 
> > So unless you can reproduce this in a standard linux environment and can
> > provide a testcase I don't see a way to get this bug forward.
> > 
> > Richard.
> > 
> >> Thanks a lot.
> >> 
> >> Qing
> > 
> > -- 
> > Richard Biener mailto:rguent...@suse.de>>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

RE: [PATCH] testsuite: aarch64: Add tests for vpaddq intrinsics

2021-02-09 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Jonathan Wright 
> Sent: 09 February 2021 17:08
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] testsuite: aarch64: Add tests for vpaddq intrinsics
> 
> Hi,
> 
> As subject, this patch adds tests for vpaddq_* Neon intrinsics. Since these
> intrinsics are only supported for AArch64, these tests are restricted to
> only run on AArch64 targets.
> 
> (There are currently no tests covering these intrinsics.)
> 
> Ok for master?
> 

Ok. Adding tests to cover existing intrinsics is always welcome.
Thanks,
Kyrill

> Thanks,
> Jonathan
> 
> ---
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-02-09  Jonathan Wright  
> 
> * gcc.target/aarch64/advsimd-intrinsics/vpXXXq.inc:
> New test template.
> * gcc.target/aarch64/advsimd-intrinsics/vpaddq.c: New test.

[PATCH] testsuite: aarch64: Add tests for vpaddq intrinsics

2021-02-09 Thread Jonathan Wright via Gcc-patches

Hi,

As subject, this patch adds tests for vpaddq_* Neon intrinsics. Since these
intrinsics are only supported for AArch64, these tests are restricted to
only run on AArch64 targets.

(There are currently no tests covering these intrinsics.)

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-02-09  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vpXXXq.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vpaddq.c: New test.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpXXXq.inc b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpXXXq.inc
new file mode 100644
index ..3c27d32992c0e3a1d69580d1699c28f01fbb76ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpXXXq.inc
@@ -0,0 +1,96 @@
+#define FNNAME1(NAME) exec_ ## NAME
+#define FNNAME(NAME) FNNAME1(NAME)
+
+void FNNAME (INSN_NAME) (void)
+{
+  /* Basic test: y=OP(x), then store the result.  */
+#define TEST_VPXXXQ1(INSN, T1, T2, W, N)\
+  VECT_VAR(vector_res, T1, W, N) =	\
+INSN##_##T2##W(VECT_VAR(vector, T1, W, N),\
+		   VECT_VAR(vector, T1, W, N));\
+  vst1q##_##T2##W(VECT_VAR(result, T1, W, N),\
+		  VECT_VAR(vector_res, T1, W, N))
+
+#define TEST_VPXXXQ(INSN, T1, T2, W, N)	\
+  TEST_VPXXXQ1(INSN, T1, T2, W, N)	\
+
+  DECL_VARIABLE(vector, int, 8, 16);
+  DECL_VARIABLE(vector, int, 16, 8);
+  DECL_VARIABLE(vector, int, 32, 4);
+  DECL_VARIABLE(vector, int, 64, 2);
+  DECL_VARIABLE(vector, uint, 8, 16);
+  DECL_VARIABLE(vector, uint, 16, 8);
+  DECL_VARIABLE(vector, uint, 32, 4);
+  DECL_VARIABLE(vector, uint, 64, 2);
+#if defined (__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+  DECL_VARIABLE(vector, float, 16, 8);
+#endif
+  DECL_VARIABLE(vector, float, 32, 4);
+  DECL_VARIABLE(vector, float, 64, 2);
+
+  DECL_VARIABLE(vector_res, int, 8, 16);
+  DECL_VARIABLE(vector_res, int, 16, 8);
+  DECL_VARIABLE(vector_res, int, 32, 4);
+  DECL_VARIABLE(vector_res, int, 64, 2);
+  DECL_VARIABLE(vector_res, uint, 8, 16);
+  DECL_VARIABLE(vector_res, uint, 16, 8);
+  DECL_VARIABLE(vector_res, uint, 32, 4);
+  DECL_VARIABLE(vector_res, uint, 64, 2);
+#if defined (__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+  DECL_VARIABLE(vector_res, float, 16, 8);
+#endif
+  DECL_VARIABLE(vector_res, float, 32, 4);
+  DECL_VARIABLE(vector_res, float, 64, 2);
+
+  clean_results ();
+
+  /* Initialize input "vector" from "buffer".  */
+  VLOAD(vector, buffer, q, int, s, 8, 16);
+  VLOAD(vector, buffer, q, int, s, 16, 8);
+  VLOAD(vector, buffer, q, int, s, 32, 4);
+  VLOAD(vector, buffer, q, int, s, 64, 2);
+  VLOAD(vector, buffer, q, uint, u, 8, 16);
+  VLOAD(vector, buffer, q, uint, u, 16, 8);
+  VLOAD(vector, buffer, q, uint, u, 32, 4);
+  VLOAD(vector, buffer, q, uint, u, 64, 2);
+#if defined (__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+  VLOAD(vector, buffer, q, float, f, 16, 8);
+#endif
+  VLOAD(vector, buffer, q, float, f, 32, 4);
+  VLOAD(vector, buffer, q, float, f, 64, 2);
+
+  /* Apply a binary operator named INSN_NAME.  */
+  TEST_VPXXXQ(INSN_NAME, int, s, 8, 16);
+  TEST_VPXXXQ(INSN_NAME, int, s, 16, 8);
+  TEST_VPXXXQ(INSN_NAME, int, s, 32, 4);
+  TEST_VPXXXQ(INSN_NAME, int, s, 64, 2);
+  TEST_VPXXXQ(INSN_NAME, uint, u, 8, 16);
+  TEST_VPXXXQ(INSN_NAME, uint, u, 16, 8);
+  TEST_VPXXXQ(INSN_NAME, uint, u, 32, 4);
+  TEST_VPXXXQ(INSN_NAME, uint, u, 64, 2);
+#if defined (__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+  TEST_VPXXXQ(INSN_NAME, float, f, 16, 8);
+#endif
+  TEST_VPXXXQ(INSN_NAME, float, f, 32, 4);
+  TEST_VPXXXQ(INSN_NAME, float, f, 64, 2);
+
+  CHECK(TEST_MSG, int, 8, 16, PRIx8, expected, "");
+  CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, int, 64, 2, PRIx64, expected, "");
+  CHECK(TEST_MSG, uint, 8, 16, PRIx8, expected, "");
+  CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, uint, 64, 2, PRIx64, expected, "");
+#if defined (__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+  CHECK_FP(TEST_MSG, float, 16, 8, PRIx16, expected, "");
+#endif
+  CHECK_FP(TEST_MSG, float, 32, 4, PRIx32, expected, "");
+  CHECK_FP(TEST_MSG, float, 64, 2, PRIx64, expected, "");
+}
+
+int main (void)
+{
+  FNNAME (INSN_NAME) ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpaddq.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpaddq.c
new file mode 100644
index ..f15ada8aa52ae004389e014e4c45a5ebdddab291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpaddq.c
@@ -0,0 +1,40 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#define INSN_NAME vpaddq
+#define TEST_MSG "VPADDQ"
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 8, 16) [] = { 0xe1, 0xe5, 0xe9, 0xed,
+	   0xf1, 0xf5, 0xf9, 0xfd,
+

[committed] libstdc++: Clear up directories created by tests

libstdc++-v3/ChangeLog:

* testsuite/27_io/filesystem/operations/remove_all.cc: Remove
test directory after making it writable again.
* testsuite/experimental/filesystem/operations/remove_all.cc:
Likewise.

Tested x86_64-linux. Committed to trunk.

commit bfdb7b8c6f54807789471705a8fcd03bab3d7fc3
Author: Jonathan Wakely 
Date:   Tue Feb 9 16:53:56 2021

libstdc++: Clear up directories created by tests

libstdc++-v3/ChangeLog:

* testsuite/27_io/filesystem/operations/remove_all.cc: Remove
test directory after making it writable again.
* testsuite/experimental/filesystem/operations/remove_all.cc:
Likewise.

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/operations/remove_all.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/operations/remove_all.cc
index ee16d67debd..986a8dfb2ae 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/remove_all.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/remove_all.cc
@@ -171,6 +171,8 @@ test04()
   }
 
   fs::permissions(dir, fs::perms::owner_write, fs::perm_options::add);
+  fs::remove_all(dir, ec);
+  f.path.clear();
 #endif
 }
 
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/remove_all.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/remove_all.cc
index 9aca6c8af67..87fd2dde41d 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/remove_all.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/remove_all.cc
@@ -137,6 +137,8 @@ test04()
   }
 
   fs::permissions(dir, fs::perms::owner_write|fs::perms::add_perms);
+  fs::remove_all(dir, ec);
+  f.path.clear();
 #endif
 }

[aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

2021-02-09 Thread Joel Hutton via Gcc-patches

Hi Richards,

This patch adds support for the V8QI->V8HI case from widening vect patterns as 
discussed to target PR98772.

Bootstrapped and regression tested on aarch64.


[aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

In the case where 8 out of every 16 elements are widened using a
widening pattern and the next 8 are skipped the patterns are not
recognized. This is because they are normally used in a pair, such  as
VEC_WIDEN_MINUS_HI/LO, to achieve a v16qi->v16hi conversion for example.
This patch adds support for V8HI->V8QI patterns.

gcc/ChangeLog:
        PR tree-optimisation/98772
        * optabs-tree.c (supportable_convert_operation): Add case for V8QI->V8HI
        * tree-vect-stmts.c (vect_create_vectorized_promotion_stmts): New 
function to generate promotion stmts for V8QI->V8HI
        (vectorizable_conversion): Add case for V8QI->V8HI

gcc/testsuite/ChangeLog:
        PR tree-optimisation/98772
        * gcc.target/aarch64/pr98772.c: New test.diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index c94073e3ed98f8c4cab65891f65dedebdb1ec274..b91ce3af6f0d4b3a62110bdb38f68ecc53765cad 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -308,6 +308,40 @@ supportable_convert_operation (enum tree_code code,
   if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
 return false;
 
+  /* The case where a widening operation is not making use of the full width of
+ of the input vector, but using the full width of the output vector.
+ Return the non-wided code, which will be used after the inputs are
+ converted to the wide type.  */
+  if ((code == WIDEN_MINUS_EXPR
+  || code == WIDEN_PLUS_EXPR
+  || code == WIDEN_MULT_EXPR
+  || code == WIDEN_LSHIFT_EXPR)
+  && known_eq (TYPE_VECTOR_SUBPARTS (vectype_in),
+		  TYPE_VECTOR_SUBPARTS (vectype_out)))
+  {
+switch (code)
+{
+  case WIDEN_LSHIFT_EXPR:
+	*code1 = LSHIFT_EXPR;
+	return true;
+	break;
+  case WIDEN_MINUS_EXPR:
+	*code1 = MINUS_EXPR;
+	return true;
+	break;
+  case WIDEN_PLUS_EXPR:
+	*code1 = PLUS_EXPR;
+	return true;
+	break;
+  case WIDEN_MULT_EXPR:
+	*code1 = MULT_EXPR;
+	return true;
+	break;
+  default:
+	gcc_unreachable ();
+}
+  }
+
   /* First check if we can done conversion directly.  */
   if ((code == FIX_TRUNC_EXPR
&& can_fix_p (m1,m2,TYPE_UNSIGNED (vectype_out), )
diff --git a/gcc/testsuite/gcc.target/aarch64/pr98772.c b/gcc/testsuite/gcc.target/aarch64/pr98772.c
new file mode 100644
index ..35568a9f9d60c44aa01a6afc5f7e6a0935009aaf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr98772.c
@@ -0,0 +1,155 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include 
+#include 
+
+#define DSIZE 16
+#define PIXSIZE 64
+
+extern void
+wplus (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] + pix2[x];
+	pix1 += 16;
+	pix2 += 16;
+}
+}
+extern void __attribute__((optimize (0)))
+wplus_no_opt (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] + pix2[x];
+	pix1 += 16;
+	pix2 += 16;
+}
+}
+
+extern void
+wminus (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] - pix2[x];
+	pix1 += 16;
+	pix2 += 16;
+}
+}
+extern void __attribute__((optimize (0)))
+wminus_no_opt (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] - pix2[x];
+	pix1 += 16;
+	pix2 += 16;
+}
+}
+
+extern void
+wmult (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] * pix2[x];
+	pix1 += 16;
+	pix2 += 16;
+}
+}
+extern void __attribute__((optimize (0)))
+wmult_no_opt (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] * pix2[x];
+	pix1 += 16;
+	pix2 += 16;
+}
+}
+
+extern void
+wlshift (uint16_t *d, uint8_t *restrict pix1)
+
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] << 8;
+	pix1 += 16;
+}
+}
+extern void __attribute__((optimize (0)))
+wlshift_no_opt (uint16_t *d, uint8_t *restrict pix1)
+
+{
+for( int y = 0; y < 4; y++ )
+{
+	for( int x = 0; x < 4; x++ )
+	d[x + y*4] = pix1[x] << 8;
+	pix1 += 16;
+}
+}
+
+void __attribute__((optimize (0)))
+init_arrays(uint16_t *d_a, uint16_t *d_b, uint8_t *pix1, uint8_t *pix2)
+{
+  for(int i = 0; i < DSIZE; i++)
+  {
+d_a[i] = (1074 * i)%17;
+d_b[i] = (1074 * i)%17;
+  }
+  for(int i = 0; i < PIXSIZE; i++)
+  {
+pix1[i] = (1024 *

Re: [PATCH v2] arm: Low overhead loop handle long range branches [PR98931]

2021-02-09 Thread Andrea Corallo via Gcc-patches

Jakub Jelinek  writes:

> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches wrote:
>> >"TARGET_32BIT && TARGET_HAVE_LOB"
>> > -  "le\t%|lr, %l0")
>> > +  "*
>> > +  if (get_attr_length (insn) == 4)
>> > +return \"le\\t%|lr, %l0\";
>> > +  else
>> > +return \"subs\\t%|lr, #1\;bne\\t%l0\";
>> > +  "
>> 
>> Why not
>> {
>>   if (get_attr_length (insn) == 4)
>> return "le\t%|lr, %l0";
>>   else
>> return "subs\t%|lr, #1;bne\t%l0";
>> }
>> instead?  Seems the arm backend uses "*..." more than the more modern {},
>> but one needs to backslash prefix a lot which makes it less readable?
>
> Where "more modern" is introduced 19.5 years ago ;)
>
>   Jakub

I guess we really like traditions :)

Attached second version addressing this.

Thanks

  Andrea

>From 7434e9519929affe8bcb7f4df2a775612cba0b18 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Wed, 3 Feb 2021 15:21:54 +0100
Subject: [PATCH] arm: Low overhead loop handle long range branches [PR98931]

gcc/ChangeLog

2021-02-05  Andrea Corallo  

* config/arm/thumb2.md: Generate alternative sequence to
handle long range branches.

gcc/testsuite/Changelog

2021-02-08  Andrea Corallo  

 * gcc.target/arm/pr98931.c: New testcase.
---
 gcc/config/arm/thumb2.md   | 13 -
 gcc/testsuite/gcc.target/arm/pr98931.c | 17 +
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr98931.c

diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index bd53bf320de..a032448654a 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1719,7 +1719,18 @@
   (set (reg:SI LR_REGNUM)
(plus:SI (reg:SI LR_REGNUM) (const_int -1)))])]
   "TARGET_32BIT && TARGET_HAVE_LOB"
-  "le\t%|lr, %l0")
+  {
+if (get_attr_length (insn) == 4)
+  return "le\t%|lr, %l0";
+else
+  return "subs\t%|lr, #1;bne\t%l0";
+  }
+  [(set (attr "length")
+(if_then_else
+(lt (minus (pc) (match_dup 0)) (const_int 1024))
+   (const_int 4)
+   (const_int 6)))
+   (set_attr "type" "branch")])
 
 (define_expand "doloop_begin"
   [(match_operand 0 "" "")
diff --git a/gcc/testsuite/gcc.target/arm/pr98931.c 
b/gcc/testsuite/gcc.target/arm/pr98931.c
new file mode 100644
index 000..313876a3912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr98931.c
@@ -0,0 +1,17 @@
+/* { dg-do assemble } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
+/* { dg-options "-march=armv8.1-m.main -O3 
--param=max-completely-peeled-insns=1300 --save-temps" } */
+
+extern long long a[][20][26][26][22];
+
+void
+foo ()
+{
+  for (short d = 0; d + 1; d++)
+for (unsigned e = 0; e < 25; e += 4)
+  for (unsigned f = 0; f < 25; f += 4)
+for (int g = 0; g < 21; g += 4)
+  a[4][d][e][f][g] = 0;
+}
+
+/* { dg-final { scan-assembler-not {le\slr,\s\S*} } } */
-- 
2.20.1

[PATCH] ICF: fix memory leak

2021-02-09 Thread Martin Liška


The following fixes a memory leak.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR ipa/99003
* ipa-icf.c (sem_item::add_reference): Fix memory leak when
a reference exists.
---
 gcc/ipa-icf.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 435f567d72e..687ad8d45b7 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -165,8 +165,11 @@ sem_item::add_reference (ref_map *refs,
   unsigned index = reference_count++;
   bool existed;
 
-  vec 

-= refs->get_or_insert (new sem_usage_pair (target, index), );
+  sem_usage_pair *pair = new sem_usage_pair (target, index);
+  vec  = refs->get_or_insert (pair, );
+  if (existed)
+delete pair;
+
   v.safe_push (this);
   bitmap_set_bit (target->usage_index_bitmap, index);
   refs_set.add (target->node);
--
2.30.0

[PATCH] if-to-switch: fix a memory leak

2021-02-09 Thread Martin Liška


The following fixes a memory leak.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR tree-optimization/99002
* gimple-if-to-switch.cc (find_conditions): Fix memory leak
in the function.
---
 gcc/gimple-if-to-switch.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
index 1712fc4c8b3..f39662be3e6 100644
--- a/gcc/gimple-if-to-switch.cc
+++ b/gcc/gimple-if-to-switch.cc
@@ -447,10 +447,9 @@ find_conditions (basic_block bb,
   info->record_phi_mapping (info->m_false_edge,
>m_false_edge_phi_mapping);
   conditions_in_bbs->put (bb, info);
+  return;
 }
 
-  return;

-
 exit:
   delete info;
 }
--
2.30.0

c++: Fix indirect partitions [PR 98944]

2021-02-09 Thread Nathan Sidwell




The most recent reimplementation of module loading initialization
changed the behaviour of setting an import's location, and broke some
partition handling.

PR c++/98944
gcc/cp/
* module.cc (module_state::is_rooted): Rename to ...
(module_state::has_location): ... here.  Adjust callers.
(module_state::read_partitions): Adjust validity check.
Don't overwrite a known location.
gcc/testsuite/
* g++.dg/modules/pr98944_a.C: New.
* g++.dg/modules/pr98944_b.C: New.
* g++.dg/modules/pr98944_c.C: New.
* g++.dg/modules/pr98944_d.C: New.


--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 41ce2011525..0749db8fe94 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -3608,8 +3608,8 @@ class GTY((chain_next ("%h.parent"), for_user)) module_state {
   }
 
  public:
-  /* Is this not a real module?  */
-  bool is_rooted () const
+  /* Is this a real module?  */
+  bool has_location () const
   {
 return loc != UNKNOWN_LOCATION;
   }
@@ -4416,7 +4416,7 @@ dumper::operator () (const char *format, ...)
 	const char *str = "(none)";
 	if (module_state *m = va_arg (args, module_state *))
 	  {
-		if (!m->is_rooted ())
+		if (!m->has_location ())
 		  str = "(detached)";
 		else
 		  str = m->get_flatname ();
@@ -14441,16 +14441,17 @@ module_state::read_partitions (unsigned count)
   dump () && dump ("Reading elided partition %s (crc=%x)", name, crc);
 
   module_state *imp = get_module (name);
-  if (!imp || !imp->is_partition () || imp->is_rooted ()
-	  || get_primary (imp) != this)
+  if (!imp	/* Partition should be ...  */
+	  || !imp->is_partition () /* a partition ...  */
+	  || imp->loadedness != ML_NONE  /* that is not yet loaded ...  */
+	  || get_primary (imp) != this) /* whose primary is this.  */
 	{
 	  sec.set_overrun ();
 	  break;
 	}
 
-  /* Attach the partition without loading it.  We'll have to load
-	 for real if it's indirectly imported.  */
-  imp->loc = floc;
+  if (!imp->has_location ())
+	imp->loc = floc;
   imp->crc = crc;
   if (!imp->filename && fname[0])
 	imp->filename = xstrdup (fname);
@@ -18857,7 +18858,7 @@ direct_import (module_state *import, cpp_reader *reader)
   timevar_start (TV_MODULE_IMPORT);
   unsigned n = dump.push (import);
 
-  gcc_checking_assert (import->is_direct () && import->is_rooted ());
+  gcc_checking_assert (import->is_direct () && import->has_location ());
   if (import->loadedness == ML_NONE)
 if (!import->do_import (reader, true))
   gcc_unreachable ();
@@ -18904,7 +18905,7 @@ import_module (module_state *import, location_t from_loc, bool exporting_p,
   linemap_module_reparent (line_table, import->loc, from_loc);
 }
   gcc_checking_assert (!import->module_p);
-  gcc_checking_assert (import->is_direct () && import->is_rooted ());
+  gcc_checking_assert (import->is_direct () && import->has_location ());
 
   direct_import (import, reader);
 }
@@ -18934,7 +18935,7 @@ declare_module (module_state *module, location_t from_loc, bool exporting_p,
 }
 
   gcc_checking_assert (module->module_p);
-  gcc_checking_assert (module->is_direct () && module->is_rooted ());
+  gcc_checking_assert (module->is_direct () && module->has_location ());
 
   /* Yer a module, 'arry.  */
   module_kind &= ~MK_GLOBAL;
diff --git c/gcc/testsuite/g++.dg/modules/pr98944_a.C w/gcc/testsuite/g++.dg/modules/pr98944_a.C
new file mode 100644
index 000..9475317dc82
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr98944_a.C
@@ -0,0 +1,9 @@
+// PR 98944, the example in [module.unit]/4
+// { dg-additional-options -fmodules-ts }
+
+// tu3
+
+module A:Internals;
+// { dg-module-cmi A:Internals }
+
+int bar();
diff --git c/gcc/testsuite/g++.dg/modules/pr98944_b.C w/gcc/testsuite/g++.dg/modules/pr98944_b.C
new file mode 100644
index 000..209eafccc76
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr98944_b.C
@@ -0,0 +1,8 @@
+// { dg-additional-options -fmodules-ts }
+
+// tu2
+export module A:Foo;
+// { dg-module-cmi A:Foo }
+
+import :Internals;
+export int foo() { return 2 * (bar() + 1); }
diff --git c/gcc/testsuite/g++.dg/modules/pr98944_c.C w/gcc/testsuite/g++.dg/modules/pr98944_c.C
new file mode 100644
index 000..90be60f2629
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr98944_c.C
@@ -0,0 +1,8 @@
+// { dg-additional-options -fmodules-ts }
+
+// tu1
+export module A;
+// { dg-module-cmi A }
+
+export import :Foo;
+export int baz();
diff --git c/gcc/testsuite/g++.dg/modules/pr98944_d.C w/gcc/testsuite/g++.dg/modules/pr98944_d.C
new file mode 100644
index 000..25364ab9aae
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr98944_d.C
@@ -0,0 +1,8 @@
+// { dg-additional-options -fmodules-ts }
+
+// tu4
+module A;
+
+import :Internals;
+int bar() { return baz() - 10; }
+int baz() { return 30; }

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous

On Tue, 9 Feb 2021 16:37:31 +0100
Thomas Schwinge  wrote:

> Hi!
> 
> On 2021-02-09T13:05:22+, Julian Brown 
> wrote:
> > On Tue, 9 Feb 2021 13:45:36 +0100
> > Tobias Burnus  wrote:
> >  
> >> On 09.02.21 12:58, Thomas Schwinge wrote:  
> >> >> Granted. The array(:)%re access might update too much, but
> >> >> that's not different to array with strides or with contiguous
> >> >> arrays sections which contain component reference (and more
> >> >> than one component).
> >> > (Is that indeed allowed to "update too much"?)
> >> 
> >> Yes - that's the general problem with strides or bit sets;
> >> copying only a subset – and doing so atomically – is not
> >> always possible or feasible.
> >> 
> >> *OpenACC* 3.1 has for "2.14.4  Update Directive" the restriction:
> >> 
> >> "Noncontiguous subarrays may appear. It is implementation-specific
> >>   whether noncontiguous regions are updated by using one transfer
> >>   for each contiguous subregion, or whether the noncontiguous data
> >>   is packed, transferred once, and unpacked, or whether one or more
> >>   larger subarrays (no larger than the smallest contiguous region
> >>   that contains the specified subarray) are updated."
> >> 
> >> For map, I saw that that's the case – but I think Julian's
> >> patch does not handle this correctly for:
> >> 
> >> type t
> >>integer :: i, j, k
> >> end type t
> >> type(t) :: A(100)
> >>... host(A(:)%j)  
> 
> So I understand the variants in the quoted OpenACC part as follows.
> However I don't claim that I necessarily got all the fine detail
> right at the language level (English as well as OpenACC)!  So, please
> verify.
> 
> "Using one transfer for each contiguous subregion":
> 
> for n in 1..100: transfer A(n)%j individually
> 
> "Noncontiguous data is packed, transferred once, and unpacked":
> 
> device:
>   integer :: tmp(100)
>   for n in 1..100: tmp(n) = A(n)%j
> transfer tmp
> host:
>   for n in 1..100: A(n)%j = tmp(n)
> 
> In this example here, I understand "subarrays (no larger than the
> smallest contiguous region that contains the specified subarray)" to
> again resolve to 'A(n)%j', so doesn't add other another variant.
> 
> This -- per my reading! -- would be different here:
> 
> type t
>integer :: i, j1, j2, k
> end type t
> type(t) :: A(100)
>... host(A(:)%j1, A(:)%j2)
> 
> ... where I understand this to mean that each 'A(n)%j1' plus 'A(:)%j2'
> may be transferred together: either "using one transfer for each
> contiguous subregion":

I think you're overthinking it. My reading is that "each contiguous
subregion" just means, e.g. each "j" in:

  !$acc update host(A(:)%j)

although your case with multiple updates to adjacent fields would be
possible too, I guess.

> for n in 1..100: transfer memory region of A(n)%j1..A(n)%j2
> individually
> 
> ..., or "packed, transferred once, and unpacked":
> 
> device:
>   integer :: tmp(2 * 100)
>   for n in 1..100:
> tmp(2 * n) = A(n)%j1
> tmp(2 * n + 1) = A(n)%j2
> transfer tmp
> host:
>   for n in 1..100:
> A(n)%j1 = tmp(2 * n)
> A(n)%j2 = tmp(2 * n + 1)

Yes, fine, but...

> I do however not read into this that the following would be valid:
> 
> >> I think instead of transferring A(1)%j to A(100)%j, it transfers
> >> all of A(:), i.e. also A(1)%i and A(100)%k :-(  
> 
> I don't think it's appropriate for an 'update' to alter anything else
> than the exact 'var's as specified.

I disagree here. I think the "one or more larger subarrays" clause is
meant to indicate that a single transfer covering all the data (with
gaps) is acceptable. In terms of implementation -- much of the time
that will likely be more efficient than either transferring lots of
tiny fragments, or copying via an intermediate contiguous buffer, so
your reading of the spec might be shooting ourselves in the foot
somewhat.

One could imagine a heuristic that chooses between one large ("gappy")
transfer and, say, copying via a temporary buffer though, using the
latter if say the density of data transferred is less than some
percentage of the full amount.

> > Yes it will -- but given that A(2)%i and A(99)%k (and all the
> > in-between values) can legitimately be transferred according to the
> > spec  
> 
> I don't read it that way, I'm afraid.  :-O
> 
> > how much
> > of a problem is that? In particular, are there situations where this
> > "over-updating" can lead to incorrect results in a conforming
> > program?  
> 
> In your reading indeed it wouldn't, because the user couldn't expect
> the following:
> 
> > Perhaps the question is, can a user legitimately expect the host and
> > offloaded versions of some given memory block to hold different
> > data, like maintaining different data in a cache than the storage
> > backing that cache? One use-case for that might be double buffering
> > a "single array" (i.e. the host and device versions of that array).
> > I don't

Re: PR 96391? Can we fix it for gcc11?

2021-02-09 Thread Qing Zhao via Gcc-patches

Richard,

Thank you for the reply.

Yes, I understand that without a working testing case to repeat the error, it’s 
very hard to debug and fix the issue. 

However, providing a testing case for this bug is really challenging from our 
side due to multiple reasons…

I will discuss with our building engineer to see what we can do. Or, I will try 
to debug and fix this issue myself…


Qing

> On Feb 9, 2021, at 2:18 AM, Richard Biener  wrote:
> 
> On Mon, 8 Feb 2021, Qing Zhao wrote:
> 
>> Hi, 
>> 
>> The bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96391 
>>  
>> > >
>> 
>> Bug 96391 - [10/11 Regression] internal compiler error: in 
>> linemap_compare_locations, at libcpp/line-map.c:1359 
>> 
>> has been opened on 7/30/2020, and multiple users reported the same issue. 
>> 
>> For our important application, all the C++ modules failed with this bug when 
>> we use gcc10 or gcc11. Then we have 
>> To use icc to compile C++, and gcc to compile C, it’s very inconvenient. 
>> 
>> I have raised the priority of this bug to P2 on 10/09/2020,  hope it can be 
>> fixed in gcc11. 
>> 
>> I see that Michael Cronenworth has attached a preprocessed file for the 
>> reproducing purpose in comment 4. 
>> 
>> So, can we have the fix of the bug in gcc11?
> 
> The issue is that the preprocessed source does not reproduce the issue and
> a mingw development environment is not easily accessible (to me at least).
> 
> So unless you can reproduce this in a standard linux environment and can
> provide a testcase I don't see a way to get this bug forward.
> 
> Richard.
> 
>> Thanks a lot.
>> 
>> Qing
> 
> -- 
> Richard Biener mailto:rguent...@suse.de>>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

RE: use -mfpu=neon for arm/simd/vmmla_1.c

2021-02-09 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Gcc-patches  On Behalf Of
> Alexandre Oliva
> Sent: 05 January 2021 07:49
> To: gcc-patches@gcc.gnu.org
> Subject: use -mfpu=neon for arm/simd/vmmla_1.c
> 
> 
> On some of our arm targets, we get various -mfpu flags implicitly or
> explicitly passed to the compiler during test runs.  The target
> options pushed in arm_neon.h that affect vmmlaq_s32 set isa_bit_neon,
> but the caller doesn't have that bit set, so arm_can_inline_p rejects
> the attempt to inline it, and the test fails.
> 
> An explicit -mfpu=neon would address the compile problem, but cause
> the assembler to reject the generated code.
> 
> So this patch adds -mfpu=auto to the test, overriding any implicit
> flags with the fpu implied by the arch.
> 
> Regstrapped on x86_64-linux-gnu, also tested on x-arm-wrs-vxworks7r2.
> Ok to install?

Ok. Aren't there more tests that have this problem?
Thanks,
Kyrill

> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arm/simd/vmmla_1.c: Pass -mfpu=auto.
> ---
>  gcc/testsuite/gcc.target/arm/simd/vmmla_1.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/vmmla_1.c
> b/gcc/testsuite/gcc.target/arm/simd/vmmla_1.c
> index aeb4a359e3351..d33ebf361d436 100644
> --- a/gcc/testsuite/gcc.target/arm/simd/vmmla_1.c
> +++ b/gcc/testsuite/gcc.target/arm/simd/vmmla_1.c
> @@ -1,6 +1,6 @@
>  /* { dg-do assemble } */
>  /* { dg-require-effective-target arm_v8_2a_i8mm_ok } */
> -/* { dg-options "-save-temps -O2 -march=armv8.2-a+i8mm -mfloat-
> abi=hard" } */
> +/* { dg-options "-save-temps -O2 -march=armv8.2-a+i8mm -mfpu=auto -
> mfloat-abi=hard" } */
> 
>  #include "arm_neon.h"
> 
> 
> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>Free Software Activist GNU Toolchain Engineer
> Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

Re: [PATCH] adjust "partly out of bounds" warning (PR 98503)


On 2/9/21 12:41 AM, Richard Biener wrote:

On Tue, Feb 9, 2021 at 1:04 AM Martin Sebor via Gcc-patches
 wrote:


On 2/8/21 12:09 PM, Jeff Law wrote:



On 2/3/21 3:45 PM, Martin Sebor wrote:

On 2/3/21 2:57 PM, Jeff Law wrote:



On 1/28/21 4:03 PM, Martin Sebor wrote:

The GCC 11 -Warray-bounds enhancement to diagnose accesses whose
leading offset is in bounds but whose trailing offset is not has
been causing some confusion.  When the warning is issued for
an access to an in-bounds member via a pointer to a struct that's
larger than the pointed-to object, phrasing this strictly invalid
access in terms of array subscripts can be misleading, especially
when the source code doesn't involve any arrays or indexing.

Since the problem boils down to an aliasing violation much more
so than an actual out-of-bounds access, the attached patch adjusts
the code to issue a -Wstrict-aliasing warning in these cases instead
of -Warray-bounds.  In addition, since the aliasing assumptions in
GCC can be disabled by -fno-strict-aliasing, the patch also makes
these instances of the warning conditional on -fstrict-aliasing
being in effect.

Martin

gcc-98503.diff

PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be
more appropriate

gcc/ChangeLog:

  PR middle-end/98503
  * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
  Issue -Wstrict-aliasing for a subset of violations.
  (array_bounds_checker::check_array_bounds):  Set new member.
  * gimple-array-bounds.h (array_bounds_checker::cref_of_mref): New
  data member.

gcc/testsuite/ChangeLog:

  PR middle-end/98503
  * g++.dg/warn/Warray-bounds-10.C: Adjust text of expected warnings.
  * g++.dg/warn/Warray-bounds-11.C: Same.
  * g++.dg/warn/Warray-bounds-12.C: Same.
  * g++.dg/warn/Warray-bounds-13.C: Same.
  * gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing.  Adjust text
  of expected warnings.
  * gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings.
  * gcc.dg/Wstrict-aliasing-2.c: New test.
  * gcc.dg/Wstrict-aliasing-3.c: New test.

What I don't like here is the strict-aliasing warnings inside the file
and analysis phase for array bounds checking.

ISTM that catching this at cast time would be better.  So perhaps in
build_c_cast and the C++ equivalent?

It would mean we're warning at the cast site rather than the
dereference, but that may ultimately be better for the warning anyway.
I'm not sure.


I had actually experimented with a this (in the middle end, and only
for accesses) but even that caused so many warnings that I abandoned
it, at least for now.  -Warray-bounds has the benefit of flow analysis
(and dead code elimination).  In the front end it would have neither
and be both excessively noisy and ineffective at the same time.  For
example:

I think we agree that this really is an aliasing issue and I would go
further and claim that it has nothing to do with array bounds checking.
Therefore warning for it within gimple-array-bounds just seems wrong.

I realize that you like having DCE run and the ability to look at
offsets and such, but it really feels like the wrong place to do this.
Aliasing issues are also more of a front-end thing since the language
defines what is and what is not valid aliasing -- one might even argue
that any aliasing warning needs to be identified by the front-ends
exclusively (though possibly pruned away later).


The aliasing restrictions are on accesses, which are [defined in
C as] execution-time actions on objects.  Analyzing execution-time
actions unavoidably depends on flow analysis which the front ends
have only very limited support for (simple expressions only).
I gave one example showing how the current -Wstrict-aliasing in
the front end is ineffective against all but the most simplistic
bugs, but there are many more.  For instance:

int i;
void *p = // valid
float *q = p;// valid
*q = 0;  // aliasing violation

This bug is easily detectable in the middle end but impossible
to do in the front end (same as all other invalid accesses).


But the code is valid in GIMPLE which allows to re-use the 'int i' storage
with storing using a different type.


Presumably you're referring to using placement new?  The warning
would have to be aware of it and either run before placement new
is inlined.  Alternatively, the inlining could add some annotation
into the IL that the warning would then use to differentiate valid
code from invalid.

Likewise if there are other such constructs (are there?) they would
need be marked up somehow by the front end.

I speculate that's what Jeff was suggesting by having the FE mark
up the code.

Martin




Whether this is done in gimple-array-bounds or some other pass seems
to me like a minor implementation detail.  It naturally came out of
an enhancement I implemented there (which would detect the above
with float replaced by any larger type as an out-of-bounds access)
but I

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous

2021-02-09 Thread Thomas Schwinge

Hi!

On 2021-02-09T13:05:22+, Julian Brown  wrote:
> On Tue, 9 Feb 2021 13:45:36 +0100
> Tobias Burnus  wrote:
>
>> On 09.02.21 12:58, Thomas Schwinge wrote:
>> >> Granted. The array(:)%re access might update too much, but that's
>> >> not different to array with strides or with contiguous arrays
>> >> sections which contain component reference (and more than one
>> >> component).
>> > (Is that indeed allowed to "update too much"?)
>>
>> Yes - that's the general problem with strides or bit sets;
>> copying only a subset – and doing so atomically – is not
>> always possible or feasible.
>>
>> *OpenACC* 3.1 has for "2.14.4  Update Directive" the restriction:
>>
>> "Noncontiguous subarrays may appear. It is implementation-specific
>>   whether noncontiguous regions are updated by using one transfer
>>   for each contiguous subregion, or whether the noncontiguous data
>>   is packed, transferred once, and unpacked, or whether one or more
>>   larger subarrays (no larger than the smallest contiguous region
>>   that contains the specified subarray) are updated."
>>
>> For map, I saw that that's the case – but I think Julian's
>> patch does not handle this correctly for:
>>
>> type t
>>integer :: i, j, k
>> end type t
>> type(t) :: A(100)
>>... host(A(:)%j)

So I understand the variants in the quoted OpenACC part as follows.
However I don't claim that I necessarily got all the fine detail right at
the language level (English as well as OpenACC)!  So, please verify.

"Using one transfer for each contiguous subregion":

for n in 1..100: transfer A(n)%j individually

"Noncontiguous data is packed, transferred once, and unpacked":

device:
  integer :: tmp(100)
  for n in 1..100: tmp(n) = A(n)%j
transfer tmp
host:
  for n in 1..100: A(n)%j = tmp(n)

In this example here, I understand "subarrays (no larger than the
smallest contiguous region that contains the specified subarray)" to
again resolve to 'A(n)%j', so doesn't add other another variant.

This -- per my reading! -- would be different here:

type t
   integer :: i, j1, j2, k
end type t
type(t) :: A(100)
   ... host(A(:)%j1, A(:)%j2)

... where I understand this to mean that each 'A(n)%j1' plus 'A(:)%j2'
may be transferred together: either "using one transfer for each
contiguous subregion":

for n in 1..100: transfer memory region of A(n)%j1..A(n)%j2 individually

..., or "packed, transferred once, and unpacked":

device:
  integer :: tmp(2 * 100)
  for n in 1..100:
tmp(2 * n) = A(n)%j1
tmp(2 * n + 1) = A(n)%j2
transfer tmp
host:
  for n in 1..100:
A(n)%j1 = tmp(2 * n)
A(n)%j2 = tmp(2 * n + 1)

I do however not read into this that the following would be valid:

>> I think instead of transferring A(1)%j to A(100)%j, it transfers
>> all of A(:), i.e. also A(1)%i and A(100)%k :-(

I don't think it's appropriate for an 'update' to alter anything else
than the exact 'var's as specified.

> Yes it will -- but given that A(2)%i and A(99)%k (and all the in-between
> values) can legitimately be transferred according to the spec

I don't read it that way, I'm afraid.  :-O

> how much
> of a problem is that? In particular, are there situations where this
> "over-updating" can lead to incorrect results in a conforming program?

In your reading indeed it wouldn't, because the user couldn't expect the
following:

> Perhaps the question is, can a user legitimately expect the host and
> offloaded versions of some given memory block to hold different data,
> like maintaining different data in a cache than the storage backing
> that cache? One use-case for that might be double buffering a "single
> array" (i.e. the host and device versions of that array). I don't think
> that's something we'd want to encourage, though.

I find the wording in the spec rather explicitly, for example: OpenACC
3.1, 2.7 "Data Clauses":

| In all cases, the compiler will allocate and manage a copy of the 'var'
| in the memory of the current device, creating a *visible device copy*
| of that 'var', for data not in shared memory.

Emphasis mine; and I indeed understand this to mean that the user can
"legitimately expect the host and offloaded versions of some given memory
block to hold different data, [...]" (your words from above).

> I think, rather, that partial updates are an optimisation the user can
> use when they know that only part of an array has been updated, so
> slight over-copying is harmless.

Interesting -- and, again: that makes sense in your reading.

So.  Should everybody work through this again, trying to reach consensus?
Do we need to clarify that with OpenACC?

As for OpenMP, Tobias stated:

|On 2021-02-09T13:45:36+0100, Tobias Burnus  wrote:
|> For OpenMP and map, I recall encountering a code which did do
|> this for OpenMP (i.e. contiguous subsection). I think it was
|> related to derived-type 'map', but I do not recall anymore.
|>
|>
|> Looking at the

Re: [PATCH] x86: Always save and restore shadow stack pointer

2021-02-09 Thread H.J. Lu via Gcc-patches

On Tue, Feb 9, 2021 at 6:33 AM Jakub Jelinek  wrote:
>
> On Tue, Feb 09, 2021 at 06:25:10AM -0800, H.J. Lu via Gcc-patches wrote:
> > My patch does nothing for GCC 10.
> >
> > > files from GCC 11
> > > and -fcf-protection=none (aka the long-year default)?
> > >
> >
> > Yes, if __builtin_setjmp and __builtin_longjmp aren't used.
> > No, if they are.
>
> Therefore ABI change.
>
> Also, is RDSPP really a nop even on say -march=i386/-march=i486?
> Or was it only ENDBR32 that is problematic for the very old CPUs?
>

Yes, my patch isn't a good idea.  It is better to detect such conditions at
link-time.

-- 
H.J.

Re: [PATCH] x86: Always save and restore shadow stack pointer

On Tue, Feb 09, 2021 at 06:25:10AM -0800, H.J. Lu via Gcc-patches wrote:
> My patch does nothing for GCC 10.
> 
> > files from GCC 11
> > and -fcf-protection=none (aka the long-year default)?
> >
> 
> Yes, if __builtin_setjmp and __builtin_longjmp aren't used.
> No, if they are.

Therefore ABI change.

Also, is RDSPP really a nop even on say -march=i386/-march=i486?
Or was it only ENDBR32 that is problematic for the very old CPUs?

Jakub

Re: [PATCH] x86: Always save and restore shadow stack pointer

2021-02-09 Thread H.J. Lu via Gcc-patches

On Tue, Feb 9, 2021 at 6:19 AM Richard Biener
 wrote:
>
> On Tue, Feb 9, 2021 at 2:11 PM H.J. Lu  wrote:
> >
> > On Tue, Feb 9, 2021 at 12:59 AM Richard Biener
> >  wrote:
> > >
> > > On Mon, Feb 8, 2021 at 3:07 PM H.J. Lu via Gcc-patches
> > >  wrote:
> > > >
> > > > When the SHSTK feature is not available or not enabled, RDSSP is a NOP,
> > > > always save and restore shadow stack pointer to support compiling source
> > > > codes, containing __builtin_setjmp and __builtin_longjmp, with different
> > > > -fcf-protection options.
> > >
> > > Is that an ABI change for -fno-cf-protection?
> > >
> >
> > Currently you can't mix object files with __builtin_setjmp and 
> > __builtin_longjmp
> > compiled with -fcf-protection and -fcf-protection=none.  This patch fixes 
> > it for
> > GCC 11 and newer.  It does nothing for object files compiled by older 
> > versions
> > of GCC.
>
> So object files compiled with GCC 10 still inter-operate with object

My patch does nothing for GCC 10.

> files from GCC 11
> and -fcf-protection=none (aka the long-year default)?
>

Yes, if __builtin_setjmp and __builtin_longjmp aren't used.
No, if they are.

-- 
H.J.

Re: [PATCH] x86: Always save and restore shadow stack pointer

On Tue, Feb 9, 2021 at 2:11 PM H.J. Lu  wrote:
>
> On Tue, Feb 9, 2021 at 12:59 AM Richard Biener
>  wrote:
> >
> > On Mon, Feb 8, 2021 at 3:07 PM H.J. Lu via Gcc-patches
> >  wrote:
> > >
> > > When the SHSTK feature is not available or not enabled, RDSSP is a NOP,
> > > always save and restore shadow stack pointer to support compiling source
> > > codes, containing __builtin_setjmp and __builtin_longjmp, with different
> > > -fcf-protection options.
> >
> > Is that an ABI change for -fno-cf-protection?
> >
>
> Currently you can't mix object files with __builtin_setjmp and 
> __builtin_longjmp
> compiled with -fcf-protection and -fcf-protection=none.  This patch fixes it 
> for
> GCC 11 and newer.  It does nothing for object files compiled by older versions
> of GCC.

So object files compiled with GCC 10 still inter-operate with object
files from GCC 11
and -fcf-protection=none (aka the long-year default)?

Richard.

> --
> H.J.

Re: [PATCH] arm: Low overhead loop handle long range branches [PR98931]

On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches wrote:
> >"TARGET_32BIT && TARGET_HAVE_LOB"
> > -  "le\t%|lr, %l0")
> > +  "*
> > +  if (get_attr_length (insn) == 4)
> > +return \"le\\t%|lr, %l0\";
> > +  else
> > +return \"subs\\t%|lr, #1\;bne\\t%l0\";
> > +  "
> 
> Why not
> {
>   if (get_attr_length (insn) == 4)
> return "le\t%|lr, %l0";
>   else
> return "subs\t%|lr, #1;bne\t%l0";
> }
> instead?  Seems the arm backend uses "*..." more than the more modern {},
> but one needs to backslash prefix a lot which makes it less readable?

Where "more modern" is introduced 19.5 years ago ;)

Jakub

Re: [PATCH] arm: Low overhead loop handle long range branches [PR98931]

On Tue, Feb 09, 2021 at 02:38:44PM +0100, Andrea Corallo via Gcc-patches wrote:
> >From c8216ed1313d670e79b28141dadd644e698c83cf Mon Sep 17 00:00:00 2001
> From: Andrea Corallo 
> Date: Wed, 3 Feb 2021 15:21:54 +0100
> Subject: [PATCH] arm: Low overhead loop handle long range branches [PR98931]
> 
> gcc/ChangeLog
> 
> 2021-02-05  Andrea Corallo  
> 
> * config/arm/thumb2.md: Generate alternative sequence to
>   handle long range branches.
> 
> gcc/testsuite/Changelog
> 
> 2021-02-08  Andrea Corallo  
> 
>  * gcc.target/arm/pr98931.c: New testcase.
> ---
>  gcc/config/arm/thumb2.md   | 13 -
>  gcc/testsuite/gcc.target/arm/pr98931.c | 17 +
>  2 files changed, 29 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr98931.c
> 
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index bd53bf320de..2646926d3c1 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -1719,7 +1719,18 @@
>(set (reg:SI LR_REGNUM)
> (plus:SI (reg:SI LR_REGNUM) (const_int -1)))])]
>"TARGET_32BIT && TARGET_HAVE_LOB"
> -  "le\t%|lr, %l0")
> +  "*
> +  if (get_attr_length (insn) == 4)
> +return \"le\\t%|lr, %l0\";
> +  else
> +return \"subs\\t%|lr, #1\;bne\\t%l0\";
> +  "

Why not
{
  if (get_attr_length (insn) == 4)
return "le\t%|lr, %l0";
  else
return "subs\t%|lr, #1;bne\t%l0";
}
instead?  Seems the arm backend uses "*..." more than the more modern {},
but one needs to backslash prefix a lot which makes it less readable?

Jakub

[PATCH] arm: Low overhead loop handle long range branches [PR98931]

2021-02-09 Thread Andrea Corallo via Gcc-patches

Hi all,

this is to fix PR98931 where the LE (loop end, Armv8.1-M low overhead
loops) instruction cannot cover sufficently long branches.

In this case we emit as an alternative:

subslr, #1
bne label

arm-none-eabi regtested, arm-none-linux-gnueabihf boostrapped.

Okay for trunk?

Regards

  Andrea
  
>From c8216ed1313d670e79b28141dadd644e698c83cf Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Wed, 3 Feb 2021 15:21:54 +0100
Subject: [PATCH] arm: Low overhead loop handle long range branches [PR98931]

gcc/ChangeLog

2021-02-05  Andrea Corallo  

* config/arm/thumb2.md: Generate alternative sequence to
handle long range branches.

gcc/testsuite/Changelog

2021-02-08  Andrea Corallo  

 * gcc.target/arm/pr98931.c: New testcase.
---
 gcc/config/arm/thumb2.md   | 13 -
 gcc/testsuite/gcc.target/arm/pr98931.c | 17 +
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr98931.c

diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index bd53bf320de..2646926d3c1 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1719,7 +1719,18 @@
   (set (reg:SI LR_REGNUM)
(plus:SI (reg:SI LR_REGNUM) (const_int -1)))])]
   "TARGET_32BIT && TARGET_HAVE_LOB"
-  "le\t%|lr, %l0")
+  "*
+  if (get_attr_length (insn) == 4)
+return \"le\\t%|lr, %l0\";
+  else
+return \"subs\\t%|lr, #1\;bne\\t%l0\";
+  "
+  [(set (attr "length")
+(if_then_else
+(lt (minus (pc) (match_dup 0)) (const_int 1024))
+   (const_int 4)
+   (const_int 6)))
+   (set_attr "type" "branch")])
 
 (define_expand "doloop_begin"
   [(match_operand 0 "" "")
diff --git a/gcc/testsuite/gcc.target/arm/pr98931.c 
b/gcc/testsuite/gcc.target/arm/pr98931.c
new file mode 100644
index 000..313876a3912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr98931.c
@@ -0,0 +1,17 @@
+/* { dg-do assemble } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
+/* { dg-options "-march=armv8.1-m.main -O3 
--param=max-completely-peeled-insns=1300 --save-temps" } */
+
+extern long long a[][20][26][26][22];
+
+void
+foo ()
+{
+  for (short d = 0; d + 1; d++)
+for (unsigned e = 0; e < 25; e += 4)
+  for (unsigned f = 0; f < 25; f += 4)
+for (int g = 0; g < 21; g += 4)
+  a[4][d][e][f][g] = 0;
+}
+
+/* { dg-final { scan-assembler-not {le\slr,\s\S*} } } */
-- 
2.20.1

Re: [PATCH] x86: Always save and restore shadow stack pointer

2021-02-09 Thread H.J. Lu via Gcc-patches

On Tue, Feb 9, 2021 at 12:59 AM Richard Biener
 wrote:
>
> On Mon, Feb 8, 2021 at 3:07 PM H.J. Lu via Gcc-patches
>  wrote:
> >
> > When the SHSTK feature is not available or not enabled, RDSSP is a NOP,
> > always save and restore shadow stack pointer to support compiling source
> > codes, containing __builtin_setjmp and __builtin_longjmp, with different
> > -fcf-protection options.
>
> Is that an ABI change for -fno-cf-protection?
>

Currently you can't mix object files with __builtin_setjmp and __builtin_longjmp
compiled with -fcf-protection and -fcf-protection=none.  This patch fixes it for
GCC 11 and newer.  It does nothing for object files compiled by older versions
of GCC.

-- 
H.J.

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous

On Tue, 9 Feb 2021 13:45:36 +0100
Tobias Burnus  wrote:

> On 09.02.21 12:58, Thomas Schwinge wrote:
> >> Granted. The array(:)%re access might update too much, but that's
> >> not different to array with strides or with contiguous arrays
> >> sections which contain component reference (and more than one
> >> component).  
> > (Is that indeed allowed to "update too much"?)  
> 
> Yes - that's the general problem with strides or bit sets;
> copying only a subset – and doing so atomically – is not
> always possible or feasible.
> 
> *OpenACC* 3.1 has for "2.14.4  Update Directive" the restriction:
> 
> "Noncontiguous subarrays may appear. It is implementation-specific
>   whether noncontiguous regions are updated by using one transfer
>   for each contiguous subregion, or whether the noncontiguous data
>   is packed, transferred once, and unpacked, or whether one or more
>   larger subarrays (no larger than the smallest contiguous region
>   that contains the specified subarray) are updated."
> 
> For map, I saw that that's the case – but I think Julian's
> patch does not handle this correctly for:
> 
> type t
>integer :: i, j, k
> end type t
> type(t) :: A(100)
>... host(A(:)%j)
> 
> I think instead of transferring A(1)%j to A(100)%j, it transfers
> all of A(:), i.e. also A(1)%i and A(100)%k :-(
> 
> ^– Julian?

Yes it will -- but given that A(2)%i and A(99)%k (and all the in-between
values) can legitimately be transferred according to the spec, how much
of a problem is that? In particular, are there situations where this
"over-updating" can lead to incorrect results in a conforming program?

Perhaps the question is, can a user legitimately expect the host and
offloaded versions of some given memory block to hold different data,
like maintaining different data in a cache than the storage backing
that cache? One use-case for that might be double buffering a "single
array" (i.e. the host and device versions of that array). I don't think
that's something we'd want to encourage, though.

I think, rather, that partial updates are an optimisation the user can
use when they know that only part of an array has been updated, so
slight over-copying is harmless.

Thanks,

Julian

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous


On 09.02.21 12:58, Thomas Schwinge wrote:

Granted. The array(:)%re access might update too much, but that's not
different to array with strides or with contiguous arrays sections
which contain component reference (and more than one component).

(Is that indeed allowed to "update too much"?)


Yes - that's the general problem with strides or bit sets;
copying only a subset – and doing so atomically – is not
always possible or feasible.

*OpenACC* 3.1 has for "2.14.4  Update Directive" the restriction:

"Noncontiguous subarrays may appear. It is implementation-specific
 whether noncontiguous regions are updated by using one transfer
 for each contiguous subregion, or whether the noncontiguous data
 is packed, transferred once, and unpacked, or whether one or more
 larger subarrays (no larger than the smallest contiguous region
 that contains the specified subarray) are updated."

For map, I saw that that's the case – but I think Julian's
patch does not handle this correctly for:

type t
  integer :: i, j, k
end type t
type(t) :: A(100)
  ... host(A(:)%j)

I think instead of transferring A(1)%j to A(100)%j, it transfers
all of A(:), i.e. also A(1)%i and A(100)%k :-(

^– Julian?


For OpenMP and map, I recall encountering a code which did do
this for OpenMP (i.e. contiguous subsection). I think it was
related to derived-type 'map', but I do not recall anymore.


Looking at the *OpenMP* 5.1 spec, I see that 'target update' also
allows: "The list items that appear in the to or from clauses
may include array sections with stride expressions."
While for the map clause, there is:
'If a list item is an array section, it must specify contiguous
storage.'

But I did not see a more explicit description how that should be
handled, contrary to the rather explicit description for OpenACC.

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf

[committed] libstdc++: Make coroutine_handle<_Promise>::from_address() noexcept [PR 99021]

The coroutine_handle::from_address(void*) version is already
noexcept, and they do the same thing. Make them consistent.

libstdc++-v3/ChangeLog:

PR libstdc++/99021
* include/std/coroutine (coroutine_handle::from_address): Add
noexcept.

Tested x86_64-linux. Committed to trunk.

commit 26a3f288f1895a8c061c0458590542a3d2ee796a
Author: Jonathan Wakely 
Date:   Tue Feb 9 11:23:29 2021

libstdc++: Make coroutine_handle<_Promise>::from_address() noexcept [PR 
99021]

The coroutine_handle::from_address(void*) version is already
noexcept, and they do the same thing. Make them consistent.

libstdc++-v3/ChangeLog:

PR libstdc++/99021
* include/std/coroutine (coroutine_handle::from_address): Add
noexcept.

diff --git a/libstdc++-v3/include/std/coroutine 
b/libstdc++-v3/include/std/coroutine
index e69024caf4c..209deb7bb42 100644
--- a/libstdc++-v3/include/std/coroutine
+++ b/libstdc++-v3/include/std/coroutine
@@ -206,7 +206,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   constexpr void* address() const noexcept { return _M_fr_ptr; }
 
-  constexpr static coroutine_handle from_address(void* __a)
+  constexpr static coroutine_handle from_address(void* __a) noexcept
   {
coroutine_handle __self;
__self._M_fr_ptr = __a;

Re: [PATCH] ix86: Support V{2,4}DImode arithmetic right shifts for SSE2+ [PR98856]

On Tue, 9 Feb 2021, Jakub Jelinek wrote:

> On Tue, Feb 09, 2021 at 12:52:55PM +0100, Richard Biener wrote:
> > Yeah, it does look useful in the end.  Note that you might want
> > to adjust ix86_add_stmt_cost (or ix86_shift_rotate_cost, that is)
> > to reflect the complex expansion.
> 
> Yeah, the patch does that, see the i386.c hunks.
> 
> I guess for V2DImode vectorization, it will usually be a win only if the
> lack of the optab support would cause much larger loop not to be vectorized,
> but for V4DImode the scalar cost won't be that small already.

Due to how we cost loads and stores I guess even V2DImode vectorization of

long di[2];
void foo ()
{
 di[0] >>= 7;
 di[1] >>= 7;
}

will be considered profitable (scalar and vector loads/stores cost 12 
compared to the shift which costs 4 so we have a budget of 24 from
vectorizing the load/store we can eat from to make the vector shift
profitable).

Richard.

[PATCH] Fix O(region-size) unwind in VN

This fixes the currently O(region-size) unwinding of avail info
to be O(unwind-size) by tracking a linked-list stack of pushed
avails.  This reduces the compile-time spent in complete unrolling
for WRF and removes do_unwind from the radar of perf for the
testcase.

Bootstrapped and tested on x86_64-unknown-linux-gnu, will push soon.

2021-02-09  Richard Biener  

PR tree-optimization/98863
* tree-ssa-sccvn.h (vn_avail::next_undo): Add.
* tree-ssa-sccvn.c (last_pushed_avail): New global.
(rpo_elim::eliminate_push_avail): Chain pushed avails.
(unwind_state::avail_top): Add.
(do_unwind): Rewrite unwinding of avail entries.
(do_rpo_vn): Initialize last_pushed_avail and
avail_top of the undo state.
---
 gcc/tree-ssa-sccvn.c | 31 ---
 gcc/tree-ssa-sccvn.h |  2 ++
 2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index d45aee8e502..65b3967b9e1 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -298,6 +298,7 @@ static obstack vn_tables_insert_obstack;
 static vn_reference_t last_inserted_ref;
 static vn_phi_t last_inserted_phi;
 static vn_nary_op_t last_inserted_nary;
+static vn_ssa_aux_t last_pushed_avail;
 
 /* Valid hashtables storing information we have proven to be
correct.  */
@@ -6898,6 +6899,8 @@ rpo_elim::eliminate_push_avail (basic_block bb, tree 
leader)
   av->location = bb->index;
   av->leader = SSA_NAME_VERSION (leader);
   av->next = value->avail;
+  av->next_undo = last_pushed_avail;
+  last_pushed_avail = value;
   value->avail = av;
 }
 
@@ -7338,12 +7341,13 @@ struct unwind_state
   vn_reference_t ref_top;
   vn_phi_t phi_top;
   vn_nary_op_t nary_top;
+  vn_avail *avail_top;
 };
 
 /* Unwind the RPO VN state for iteration.  */
 
 static void
-do_unwind (unwind_state *to, int rpo_idx, rpo_elim , int *bb_to_rpo)
+do_unwind (unwind_state *to, rpo_elim )
 {
   gcc_assert (to->iterate);
   for (; last_inserted_nary != to->nary_top;
@@ -7378,20 +7382,14 @@ do_unwind (unwind_state *to, int rpo_idx, rpo_elim 
, int *bb_to_rpo)
   obstack_free (_tables_obstack, to->ob_top);
 
   /* Prune [rpo_idx, ] from avail.  */
-  /* ???  This is O(number-of-values-in-region) which is
- O(region-size) rather than O(iteration-piece).  */
-  for (hash_table::iterator i = vn_ssa_aux_hash->begin ();
-   i != vn_ssa_aux_hash->end (); ++i)
+  for (; last_pushed_avail && last_pushed_avail->avail != to->avail_top;)
 {
-  while ((*i)->avail)
-   {
- if (bb_to_rpo[(*i)->avail->location] < rpo_idx)
-   break;
- vn_avail *av = (*i)->avail;
- (*i)->avail = (*i)->avail->next;
- av->next = avail.m_avail_freelist;
- avail.m_avail_freelist = av;
-   }
+  vn_ssa_aux_t val = last_pushed_avail;
+  vn_avail *av = val->avail;
+  val->avail = av->next;
+  last_pushed_avail = av->next_undo;
+  av->next = avail.m_avail_freelist;
+  avail.m_avail_freelist = av;
 }
 }
 
@@ -7505,6 +7503,7 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
   last_inserted_ref = NULL;
   last_inserted_phi = NULL;
   last_inserted_nary = NULL;
+  last_pushed_avail = NULL;
 
   vn_valueize = rpo_vn_valueize;
 
@@ -7598,6 +7597,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
rpo_state[idx].ref_top = last_inserted_ref;
rpo_state[idx].phi_top = last_inserted_phi;
rpo_state[idx].nary_top = last_inserted_nary;
+   rpo_state[idx].avail_top
+ = last_pushed_avail ? last_pushed_avail->avail : NULL;
  }
 
if (!(bb->flags & BB_EXECUTABLE))
@@ -7675,7 +7676,7 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
}
if (iterate_to != -1)
  {
-   do_unwind (_state[iterate_to], iterate_to, avail, bb_to_rpo);
+   do_unwind (_state[iterate_to], avail);
idx = iterate_to;
if (dump_file && (dump_flags & TDF_DETAILS))
  fprintf (dump_file, "Iterating to %d BB%d\n",
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index 94968de5d6c..e4f1ff1eb20 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -212,6 +212,8 @@ struct vn_avail
   int location;
   /* The LEADER for the value we are chained on.  */
   int leader;
+  /* The previous value we pushed a avail record to.  */
+  struct vn_ssa_aux *next_undo;
 };
 
 typedef struct vn_ssa_aux
-- 
2.26.2

Re: [PATCH] ix86: Support V{2,4}DImode arithmetic right shifts for SSE2+ [PR98856]

On Tue, Feb 09, 2021 at 12:52:55PM +0100, Richard Biener wrote:
> Yeah, it does look useful in the end.  Note that you might want
> to adjust ix86_add_stmt_cost (or ix86_shift_rotate_cost, that is)
> to reflect the complex expansion.

Yeah, the patch does that, see the i386.c hunks.

I guess for V2DImode vectorization, it will usually be a win only if the
lack of the optab support would cause much larger loop not to be vectorized,
but for V4DImode the scalar cost won't be that small already.

Jakub

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous

2021-02-09 Thread Thomas Schwinge

Hi Tobias!

On 2021-02-09T12:41:48+0100, Tobias Burnus  wrote:
> Updated patch. Changes: Testcases split + updated/extended.
> OK for mainline?

I don't have any further comments on the patch itself.

> On 09.02.21 10:45, Thomas Schwinge wrote:
>> Thanks for filing/locating these discussion items for OpenACC/OpenMP
>> upstream.  May also put these references into the testcases, so that once
>> these get addressed, we have something to 'grep' for in GCC?
>
> Actually, they are already in the file.

Heh, thanks, I see -- evidently I haven't been reading carefully...

> Alternative is to add them as
> link, but I am not sure that's better. (I moved them now to the top.)

ACK.  As no doubt you've noticed, I like to put in direct/explicit links
-- but that then occasionally has lead to people external to
OpenACC/OpenMP GitHub reporting that they opened these and got a 404
"Page not found", confusing.

>> I note that 'zz' variants (see below) are not being checked for OpenMP
>
> I have now added them; I had them before but as many checks triggered,
> I thought the tests were not really worthwhile.

(Your call, I suppose -- I just noted the inconistency.)

>>> +!$acc update self(zz%re)
>>> +!$acc update self(zz%im)
>>> +end

>> And for OpenACC, the 'zz' variants do not emit this error message here.
>> (That's not immediately obvious to me.)
>
> Answer: 'git add' missing.

Heh.  ;-)

> The reason is that with and without
> Julian's patch, the error message is different. Without his patch,
> the error is:
> !$acc update self(zz%re) ! { dg-error "not a proper array section" }

Meaning that he'll then to updated these with his patch.  (Julian CCed.)

>> I can see how data mapping of '[...]%re' etc. are problematic (we're
>> constructing an "incomplete object"?), but 'update' etc. I'd have
>> expected to work: would just copy the respective "part".

> Granted. The array(:)%re access might update too much, but that's not
> different to array with strides or with contiguous arrays sections
> which contain component reference (and more than one component).

(Is that indeed allowed to "update too much"?)

> But that's more a question for the spec committee – if it is supposed
> to work, the code needs to be updated.

ACK.


Grüße
 Thomas


> Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous
>
> gcc/fortran/ChangeLog:
>
>   * expr.c (gfc_is_simplify_contiguous): Handle REF_INQUIRY, i.e.
>   %im and %re which are EXPR_VARIABLE.
>   * openmp.c (resolve_omp_clauses): Diagnose %re/%im explicitly.
>
> gcc/testsuite/ChangeLog:
>
>   * gfortran.dg/goacc/ref_inquiry.f90: New test.
>   * gfortran.dg/gomp/ref_inquiry.f90: New test.
>
>  gcc/fortran/expr.c  |  2 +
>  gcc/fortran/openmp.c|  8 
>  gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90 | 52 
> +
>  gcc/testsuite/gfortran.dg/gomp/ref_inquiry.f90  | 39 +++
>  4 files changed, 101 insertions(+)
>
> diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
> index 4f456fc629a..92a6700568d 100644
> --- a/gcc/fortran/expr.c
> +++ b/gcc/fortran/expr.c
> @@ -5854,6 +5854,8 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool strict, 
> bool permit_element)
>   part_ref  = ref;
>else if (ref->type == REF_SUBSTRING)
>   return false;
> +  else if (ref->type == REF_INQUIRY)
> + return false;
>else if (ref->u.ar.type != AR_ELEMENT)
>   ar = >u.ar;
>  }
> diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
> index 797f6c86b62..f0307c801a5 100644
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -5219,6 +5219,14 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
> *omp_clauses,
>   && array_ref->next->type == REF_SUBSTRING)))
> gfc_error ("Unexpected substring reference in %s clause "
>"at %L", name, >where);
> + else if (array_ref && array_ref->type == REF_INQUIRY)
> +   {
> + gcc_assert (array_ref->u.i == INQUIRY_RE
> + || array_ref->u.i == INQUIRY_IM);
> + gfc_error ("Unexpected complex-parts designator "
> +"reference in %s clause at %L",
> +name, >where);
> +   }
>   else if (!resolved
>   || n->expr->expr_type != EXPR_VARIABLE
>   || array_ref->next
> diff --git a/gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90 
> b/gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90
> new file mode 100644
> index 000..274065dae94
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90
> @@ -0,0 +1,52 @@
> +! Check for %re, ...%im, ...%kind, ...%len
> +! Cf. also OpenMP's ../gomp/ref_inquiry.f90
> +! Cf. OpenACC spec issue 346
> +!
> +implicit none
> +type t
> +  integer ::

Re: [PATCH] ix86: Support V{2,4}DImode arithmetic right shifts for SSE2+ [PR98856]

On Tue, 9 Feb 2021, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, we don't support arithmetic right V2DImode or
> V4DImode on x86 without -mavx512vl or -mxop.  The ISAs indeed don't have
> {,v}psraq instructions until AVX512VL, but we actually can emulate it quite
> easily.
> One case is arithmetic >> 63, we can just emit {,v}pxor; {,v}pcmpgt for
> that for SSE4.2+, or for SSE2 psrad $31; pshufd $0xf5.
> Then arithmetic >> by constant > 32, that can be done with {,v}psrad $31
> and {,v}psrad $(cst-32) and two operand permutation,
> arithmetic >> 32 can be done as {,v}psrad $31 and permutation of that
> and the original operand.  Arithmetic >> by constant < 32 can be done
> as {,v}psrad $cst and {,v}psrlq $cst and two operand permutation.
> And arithmetic >> by variable scalar amount can be done as
> arithmetic >> 63, logical >> by the amount, << by (64 - amount of the
> >> 63 result; note that the vector << 64 result in 0) and oring together.
> 
> I had to improve the permutation generation so that it actually handles
> the needed permutations (or handles them better).
> 
> Richard, does this actually improve the benchmark that regressed?

No, it doesn't improve the benchmark which, even when optimally
vectorized, is slower than scalar.  Note there's more fundamental
SLP support missing to actually get at vectorizing in this case
since on one path we have a "missing" * 1 operation.

> If not, I guess this is a GCC 12 material.

Yeah, it does look useful in the end.  Note that you might want
to adjust ix86_add_stmt_cost (or ix86_shift_rotate_cost, that is)
to reflect the complex expansion.

Richard.

> Bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> 2021-02-09  Jakub Jelinek  
> 
>   PR tree-optimization/98856
>   * config/i386/i386.c (ix86_shift_rotate_cost): Add CODE argument.
>   Expect V2DI and V4DI arithmetic right shifts to be emulated.
>   (ix86_rtx_costs, ix86_add_stmt_cost): Adjust ix86_shift_rotate_cost
>   caller.
>   * config/i386/i386-expand.c (expand_vec_perm_2perm_interleave,
>   expand_vec_perm_2perm_pblendv): New functions.
>   (ix86_expand_vec_perm_const_1): Use them.
>   * config/i386/sse.md (ashr3): Rename to ...
>   (ashr3): ... this.
>   (ashr3): New define_expand with VI248_AVX512BW iterator.
>   (ashrv4di3): New define_expand.
>   (ashrv2di3): Change condition to TARGET_SSE2, handle !TARGET_XOP
>   and !TARGET_AVX512VL expansion.
> 
>   * gcc.target/i386/sse2-psraq-1.c: New test.
>   * gcc.target/i386/sse4_2-psraq-1.c: New test.
>   * gcc.target/i386/avx-psraq-1.c: New test.
>   * gcc.target/i386/avx2-psraq-1.c: New test.
>   * gcc.target/i386/avx-pr82370.c: Adjust expected number of vpsrad
>   instructions.
>   * gcc.target/i386/avx2-pr82370.c: Likewise.
>   * gcc.target/i386/avx512f-pr82370.c: Likewise.
>   * gcc.target/i386/avx512bw-pr82370.c: Likewise.
>   * gcc.dg/torture/vshuf-4.inc: Add two further permutations.
>   * gcc.dg/torture/vshuf-8.inc: Likewise.
> 
> --- gcc/config/i386/i386.c.jj 2021-02-08 19:07:21.869292064 +0100
> +++ gcc/config/i386/i386.c2021-02-09 10:12:30.934084039 +0100
> @@ -19689,6 +19689,7 @@ ix86_division_cost (const struct process
>  
>  static int
>  ix86_shift_rotate_cost (const struct processor_costs *cost,
> + enum rtx_code code,
>   enum machine_mode mode, bool constant_op1,
>   HOST_WIDE_INT op1_val,
>   bool speed,
> @@ -19727,6 +19728,19 @@ ix86_shift_rotate_cost (const struct pro
>   count = 7;
> return ix86_vec_cost (mode, cost->sse_op * count);
>   }
> +  /* V*DImode arithmetic right shift is emulated.  */
> +  else if (code == ASHIFTRT
> +&& (mode == V2DImode || mode == V4DImode)
> +&& !TARGET_XOP
> +&& !TARGET_AVX512VL)
> + {
> +   int count = 4;
> +   if (constant_op1 && op1_val == 63 && TARGET_SSE4_2)
> + count = 2;
> +   else if (constant_op1)
> + count = 3;
> +   return ix86_vec_cost (mode, cost->sse_op * count);
> + }
>else
>   return ix86_vec_cost (mode, cost->sse_op);
>  }
> @@ -19896,13 +19910,15 @@ ix86_rtx_costs (rtx x, machine_mode mode
>  case LSHIFTRT:
>  case ROTATERT:
>bool skip_op0, skip_op1;
> -  *total = ix86_shift_rotate_cost (cost, mode, CONSTANT_P (XEXP (x, 1)),
> +  *total = ix86_shift_rotate_cost (cost, code, mode,
> +CONSTANT_P (XEXP (x, 1)),
>  CONST_INT_P (XEXP (x, 1))
>? INTVAL (XEXP (x, 1)) : -1,
>  speed,
>  GET_CODE (XEXP (x, 1)) == AND,
>  SUBREG_P (XEXP (x, 1))
> -&& GET_CODE (XEXP (XEXP (x, 1), 0)) == 
> AND,
> +

[Patch] Fortran: Fix coarray handling for gfc_dep_resolver [PR99010] (was: Re: [Patch] Fortran: Fix Array dependency with local coarrays [PR98913]


Hi all, hi Thomas,

On 02.02.21 18:15, Tobias Burnus wrote:

I think I will do a combination: If 'identical' is true, I think I
cannot remove it. If it is false, it can be identical or nonoverlapping
– which makes sense.


Well, it turned out that coarray scalars did not work; for them,
the array ref consists only of the coindexed access: falling through
then triggered as fin_dep == GFC_DEP_ERROR.

To be more careful, I also removed the 'identical &&' check such that
the first loop is now always triggered if coarray != SINGLE not only
if identical – I am not sure whether it is really needed or whether
falling though is the better solution (with updating the comment).

I would be happy if someone could carefully check the logic –
in the details, it is rather confusing.

OK?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran: Fix coarray handling for gfc_dep_resolver [PR99010]

Check failed if identical = false was requested or for -fcoarray=single
if an array ref was for a coindexed scalar.

gcc/fortran/ChangeLog:

	PR fortran/99010
	* dependency.c (gfc_dep_resolver): Fix coarray handling.

gcc/testsuite/ChangeLog:

	PR fortran/99010
	* gfortran.dg/coarray/array_temporary-1.f90: New test.

 gcc/fortran/dependency.c| 14 +++---
 gcc/testsuite/gfortran.dg/coarray/array_temporary-1.f90 | 13 +
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/dependency.c b/gcc/fortran/dependency.c
index 5de3b2cf520..e1336e1c654 100644
--- a/gcc/fortran/dependency.c
+++ b/gcc/fortran/dependency.c
@@ -2145,9 +2145,9 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gfc_reverse *reverse,
 	case REF_ARRAY:
 	  /* Coarrays: If there is a coindex, either the image differs and there
 	 is no overlap or the image is the same - then the normal analysis
-	 applies.  Hence, return early only if 'identical' is required and
-	 either ref is coindexed and more than one image can exist.  */
-	  if (identical && flag_coarray != GFC_FCOARRAY_SINGLE
+	 applies.  Hence, return early if either ref is coindexed and more
+	 than one image can exist.  */
+	  if (flag_coarray != GFC_FCOARRAY_SINGLE
 	  && ((lref->u.ar.codimen
 		   && lref->u.ar.dimen_type[lref->u.ar.dimen]
 		  != DIMEN_THIS_IMAGE)
@@ -2155,6 +2155,14 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gfc_reverse *reverse,
 		  && lref->u.ar.dimen_type[lref->u.ar.dimen]
 			 != DIMEN_THIS_IMAGE)))
 	return 1;
+	  if (lref->u.ar.dimen == 0 || rref->u.ar.dimen == 0)
+	{
+	  /* Coindexed scalar coarray with GFC_FCOARRAY_SINGLE.  */
+	  if (lref->u.ar.dimen || rref->u.ar.dimen)
+		return 1;  /* Just to be sure.  */
+	  fin_dep = GFC_DEP_EQUAL;
+	  break;
+	}
 
 	  if (ref_same_as_full_array (lref, rref))
 	return identical;
diff --git a/gcc/testsuite/gfortran.dg/coarray/array_temporary-1.f90 b/gcc/testsuite/gfortran.dg/coarray/array_temporary-1.f90
new file mode 100644
index 000..454929cf53b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray/array_temporary-1.f90
@@ -0,0 +1,13 @@
+! PR fortran/99010
+!
+! Follow-up to PR fortran/98913
+!
+! Contributed by G. Steinmetz
+!
+program p
+   integer :: x[*]
+   x = this_image()
+   if ( this_image() == 2 ) then
+  x = x[1]
+   end if
+end

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous


Hi Thomas, hi all

Updated patch. Changes: Testcases split + updated/extended.
OK for mainline?

Regarding the comments:

On 09.02.21 10:45, Thomas Schwinge wrote:

Thanks for filing/locating these discussion items for OpenACC/OpenMP
upstream.  May also put these references into the testcases, so that once
these get addressed, we have something to 'grep' for in GCC?


Actually, they are already in the file. Alternative is to add them as
link, but I am not sure that's better. (I moved them now to the top.)


I note that 'zz' variants (see below) are not being checked for OpenMP


I have now added them; I had them before but as many checks triggered,
I thought the tests were not really worthwhile.


+!$acc update self(zz%re)
+!$acc update self(zz%im)
+end

And for OpenACC, the 'zz' variants do not emit this error message here.
(That's not immediately obvious to me.)


Answer: 'git add' missing. The reason is that with and without
Julian's patch, the error message is different. Without his patch,
the error is:
!$acc update self(zz%re) ! { dg-error "not a proper array section" }


I can see how data mapping of '[...]%re' etc. are problematic (we're
constructing an "incomplete object"?), but 'update' etc. I'd have
expected to work: would just copy the respective "part".

Granted. The array(:)%re access might update too much, but that's not
different to array with strides or with contiguous arrays sections
which contain component reference (and more than one component).

But that's more a question for the spec committee – if it is supposed
to work, the code needs to be updated.

Tobias



-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous

gcc/fortran/ChangeLog:

	* expr.c (gfc_is_simplify_contiguous): Handle REF_INQUIRY, i.e.
	%im and %re which are EXPR_VARIABLE.
	* openmp.c (resolve_omp_clauses): Diagnose %re/%im explicitly.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/ref_inquiry.f90: New test.
	* gfortran.dg/gomp/ref_inquiry.f90: New test.

 gcc/fortran/expr.c  |  2 +
 gcc/fortran/openmp.c|  8 
 gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90 | 52 +
 gcc/testsuite/gfortran.dg/gomp/ref_inquiry.f90  | 39 +++
 4 files changed, 101 insertions(+)

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 4f456fc629a..92a6700568d 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -5854,6 +5854,8 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool strict, bool permit_element)
 	part_ref  = ref;
   else if (ref->type == REF_SUBSTRING)
 	return false;
+  else if (ref->type == REF_INQUIRY)
+	return false;
   else if (ref->u.ar.type != AR_ELEMENT)
 	ar = >u.ar;
 }
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 797f6c86b62..f0307c801a5 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5219,6 +5219,14 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 && array_ref->next->type == REF_SUBSTRING)))
 		  gfc_error ("Unexpected substring reference in %s clause "
  "at %L", name, >where);
+		else if (array_ref && array_ref->type == REF_INQUIRY)
+		  {
+			gcc_assert (array_ref->u.i == INQUIRY_RE
+|| array_ref->u.i == INQUIRY_IM);
+			gfc_error ("Unexpected complex-parts designator "
+   "reference in %s clause at %L",
+   name, >where);
+		  }
 		else if (!resolved
 			|| n->expr->expr_type != EXPR_VARIABLE
 			|| array_ref->next
diff --git a/gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90 b/gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90
new file mode 100644
index 000..274065dae94
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/ref_inquiry.f90
@@ -0,0 +1,52 @@
+! Check for %re, ...%im, ...%kind, ...%len
+! Cf. also OpenMP's ../gomp/ref_inquiry.f90
+! Cf. OpenACC spec issue 346
+! 
+implicit none
+type t
+  integer :: i
+  character :: c
+  complex :: z
+  complex :: zz(5)
+end type t
+
+integer :: i
+character(kind=4, len=5) :: c
+complex :: z, zz(5)
+type(t) :: x
+
+print *, is_contiguous(zz(:)%re)
+
+! inquiry function; expr_type != EXPR_VARIABLE:
+!$acc enter data copyin(i%kind, c%len) ! { dg-error "not a proper array section" }
+!$acc enter data copyin(x%i%kind)  ! { dg-error "not a proper array section" }
+!$acc enter data copyin(x%c%len)   ! { dg-error "not a proper array section" }
+!$acc update self(i%kind, c%len)   ! { dg-error "not a proper array section" }
+!$acc update self(x%i%kind)! { dg-error "not a proper array section" }
+!$acc update self(x%c%len) ! { dg-error "not a proper array section" }
+
+! EXPR_VARIABLE
+!$acc enter data copyin(z%re)! { dg-error "Unexpected complex-parts designator" }
+!$acc enter data copyin(z%im)!

New template for 'gcc' made available

Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'gcc' has been made available
to the language teams for translation.  It is archived as:

https://translationproject.org/POT-files/gcc-11.1-b20210207.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

https://gcc.gnu.org/pub/gcc/snapshots/11-20210207/gcc-11-20210207.tar.xz

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] string: Add a workaround for -Wstringop-overread false positives [PR98465]


On 09/02/21 09:57 +0100, Jakub Jelinek wrote:

Hi!

In the PR there are several possibilities how to improve _M_disjunct at
least in certain cases so that the compiler can figure out at least in some
cases where __s is provably disjunct from _M_data() ... _M_data() + this->size()
but it is probably GCC 12 material.

The false positive warning is on this particular copy, which is done for
non-disjunct pointers when __len2 > __len1 and the __s >= __p + __len1,
i.e. __s used to point to the characters moved through _S_move a few lines 
earlier
by __len2 - __len1 characters up to make space.  That is why the
_S_copy source is __s + __len2 - __len1.  Unfortunately, when the compiler
can't prove objects are disjunct, that copying from __s + __len2 - __len1
of __len2 characters can very well mean accessing characters the source
object (if it is not disjunct) provably can't have.

The following patch works around that by making the _S_copy be a __p based
pointer instead of __s based pointer.
__s + __len2 - __len1
and
__p + (__s - __p) + (__len2 - __len1)
have the same value and the latter may seem to be uselessly longer,
but it seems at least currently in GIMPLE we keep it that way and so that is
what the warning code during expansion will see, and only actually
optimize it to __s + __len2 - __len1 during RTL when we lose information
on what is a pointer and what is a mere offset with the same mode.

So, in the end we emit exactly the same assembly, just without the false
positive warning.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-09  Jakub Jelinek  

PR middle-end/98465
* include/bits/basic_string.tcc (basic_string::_M_replace): When __s
points to the characters moved by earlier _S_move, compute the source
address using expression based on the __p pointer rather than __s
pointer.

* g++.dg/warn/Wstringop-overread-1.C: New test.

--- libstdc++-v3/include/bits/basic_string.tcc.jj   2021-02-04 
18:15:05.317113098 +0100
+++ libstdc++-v3/include/bits/basic_string.tcc  2021-02-08 19:39:16.864936344 
+0100
@@ -478,7 +478,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (__s + __len2 <= __p + __len1)
this->_S_move(__p, __s, __len2);
  else if (__s >= __p + __len1)
-   this->_S_copy(__p, __s + __len2 - __len1, __len2);
+   {


Please add:

// Hint to middle end that __p and __s overlap (PR 98465)


OK with that comment added, thanks.


+ const size_type __poff = (__s - __p) + (__len2 - __len1);
+ this->_S_copy(__p, __p + __poff, __len2);
+   }
  else
{
  const size_type __nleft = (__p + __len1) - __s;
--- gcc/testsuite/g++.dg/warn/Wstringop-overread-1.C.jj 2021-02-08 
19:47:15.486646129 +0100
+++ gcc/testsuite/g++.dg/warn/Wstringop-overread-1.C2021-02-08 
19:46:16.118302317 +0100
@@ -0,0 +1,12 @@
+// PR middle-end/98994
+// { dg-do compile }
+// { dg-additional-options "-Wstringop-overread -O2" }
+
+#include 
+
+const char constantString[] = {42, 53};
+
+void f(std::string& s)
+{
+  s.insert(0, static_cast(constantString), 2);
+}

Jakub

Re: [PATCH] c++: Private parent access check for using decls [PR19377]

2021-02-09 Thread Anthony Sharp via Gcc-patches

>
> The parens are to help the editor line up the last line properly.  If
> the subexpression fits on one line, the parens aren't needed.


A okay.

No; "compile" means translate from C++ to assembly, "assemble" means
> that, plus call the assembler to produce an object file.  Though since
> compilation errors out, assembling never happens.


lol I was being dumb and thinking it was the other way around. I will
change it.

  You could bring back your lookup from the earlier patch and look
> through the result to see if the function we're complaining about is
> part of what the particular using-decl brings in.


I will give that a try.

I think you want
> SAME_BINFO_TYPE_P (context_for_name_lookup (decl),
> BINFO_TYPE (parent_binfo))


Am I reading this wrong then? :

/* Compare a BINFO_TYPE with another type for equality.
For a binfo, this is functionally equivalent to using same_type_p, but
measurably faster.
 At least one of the arguments must be a BINFO_TYPE.  The other can be a
BINFO_TYPE
or a regular type.  If BINFO_TYPE(T) ever stops being the main variant of
the class the
binfo is for, this macro must change.  */
#define SAME_BINFO_TYPE_P(A, B) ((A) == (B))

That leads me to believe that arguments A and B can be: BINFO, BINFO ... or
BINFO, TYPE ... or TYPE, BINFO. If so I don't see why this doesn't work:

(SAME_BINFO_TYPE_P (context_for_name_lookup (decl), parent_binfo))
and
(SAME_BINFO_TYPE_P (TYPE_BINFO (context_for_name_lookup (decl)),
parent_binfo))

Unless "BINFO_TYPE" is actually just how you refer to a regular class type
and not a BINFO. Seems a bit confusing to me.

This line needs one more space.


Oh I see ... so second line's arguments need to line up with the first
argument, not the first (.

I will send over a new patch later/tomorrow.

Anthony

On Tue, 9 Feb 2021 at 04:55, Jason Merrill  wrote:

> On 2/5/21 12:28 PM, Anthony Sharp wrote:
> > Hi Marek,
> >
> >>> +  if ((TREE_CODE (parent_field) == USING_DECL)
> >>
> >> This first term doesn't need to be wrapped in ().
> >
> > I normally wouldn't use the (), but I think that's how Jason likes it
> > so that's why I did it. I guess it makes it more readable.
>
> Ah, no, I don't see any need for the extra () there.  When I asked for
> added parens previously:
>
> >> +   if (parent_binfo != NULL_TREE
> >> +   && context_for_name_lookup (decl)
> >> +   != BINFO_TYPE (parent_binfo))
> >
> > Here you want parens around the second != expression and its != token
> > aligned with "context"
>
> The parens are to help the editor line up the last line properly.  If
> the subexpression fits on one line, the parens aren't needed.
>
> >> We usually use dg-do compile.
> >
> > True, but wouldn't that technically be slower? I would agree if it
> > were more than just error-handling code.
>
> No; "compile" means translate from C++ to assembly, "assemble" means
> that, plus call the assembler to produce an object file.  Though since
> compilation errors out, assembling never happens.
>
> > +  if ((TREE_CODE (parent_field) == USING_DECL)
> > +&& TREE_PRIVATE (parent_field)
> > +&& (DECL_NAME (decl) == DECL_NAME (parent_field)))
> > +   return parent_field;
>
> Following our discussion about an earlier revision, this will often
> produce the right using-declaration, but might not if two functions of
> the same name are imported from different bases.  If I split the
> using-declaration in using9.C into two, i.e.
>
> >   using A2::i; // { dg-message "declared" }
>
> >   using A::i;
>
> then I get
>
> > wa.ii: In member function ‘void C::f()’:
> > wa.ii:28:7: error: ‘int A::i()’ is private within this context
> >28 | i();
> >   |   ^
> > wa.ii:20:13: note: declared private here
> >20 |   using A2::i;
>
> pointing out the wrong using-declaration because it happens to be first.
>   You could bring back your lookup from the earlier patch and look
> through the result to see if the function we're complaining about is
> part of what the particular using-decl brings in.
>
> > I tried both:
> > (SAME_BINFO_TYPE_P (context_for_name_lookup (decl), parent_binfo))
> > and
> > (SAME_BINFO_TYPE_P (TYPE_BINFO (context_for_name_lookup (decl)),
> parent_binfo))
>
> I think you want
>
> SAME_BINFO_TYPE_P (context_for_name_lookup (decl),
> BINFO_TYPE (parent_binfo))
>
> i.e. just change same_type_p to SAME_BINFO_TYPE_P.
>
> >   tree parent_binfo = get_parent_with_private_access (decl,
> > -
>  basetype_path);
> > +
> basetype_path);
>
> This line was indented properly before, and is changed to be indented
> one space too far.
>
> > + diag_location = get_class_access_diagnostic_decl
> (parent_binfo,
> > +
> diag_decl);
>
> This line needs one more space.
>
> >   complain_about_access (decl, diag_decl, diag_location, true,
> > -parent_access);
> > +   access_failure_reason);
>
> This

Re: [PATCH] Add unordered containers heterogeneous lookup


On 04/02/21 21:20 +0100, François Dumont wrote:
    Considering that most of the code is already new code it doesn't 
look like a major tradeoff duplicate some more code.


    So here is a new proposal with only new code. However I left the 
new few XXX_tr methods accessible even in C++11 because I'll use them 
to propose a fix for PR 98066 when back in stage 1.


OK.


    libstdc++: Add unordered containers heterogeneous lookup

    Add unordered containers heterogeneous lookup member functions 
find, count, contains and
    equal_range in C++20. Those members are considered for overload 
resolution only if hash and
    equal functors used to instantiate the container have a nested 
is_transparent type.


    libstdc++-v3/ChangeLog:

    * include/bits/stl_tree.h
    (__has_is_transparent, __has_is_transparent_t): Move...
    * include/bits/stl_function.h: ...here.
    * include/bits/hashtable_policy.h 
(_Hash_code_base<>::_M_hash_code_tr): New..

    (_Hashtable_base<>::_M_equals_tr): New.
    * include/bits/hashtable.h (_Hashtable<>::_M_find_tr, 
_Hashtable<>::_M_count_tr,
    _Hashtable<>::_M_equal_range_tr): New member function 
templates to perform

    heterogeneous lookup.
    (_Hashtable<>::_M_find_before_node_tr): New.
    (_Hashtable<>::_M_find_node_tr): New.
    * include/bits/unordered_map.h (unordered_map::find<>, 
unordered_map::count<>,
    unordered_map::contains<>, unordered_map::equal_range<>): 
New member function

    templates to perform heterogeneous lookup.
    (unordered_multimap::find<>, unordered_multimap::count<>,
    unordered_multimap::contains<>, 
unordered_multimap::equal_range<>): Likewise.
    * include/bits/unordered_set.h (unordered_set::find<>, 
unordered_set::count<>,
    unordered_set::contains<>, unordered_set::equal_range<>): 
Likewise.

    (unordered_multiset::find<>, unordered_multiset::count<>,
    unordered_multiset::contains<>, 
unordered_multiset::equal_range<>): Likewise.

    * include/debug/unordered_map
    (unordered_map::find<>, unordered_map::equal_range<>): 
Likewise.
    (unordered_multimap::find<>, 
unordered_multimap::equal_range<>): Likewise.

    * include/debug/unordered_set
    (unordered_set::find<>, unordered_set::equal_range<>): 
Likewise.
    (unordered_multiset::find<>, 
unordered_multiset::equal_range<>): Likewise.
    * testsuite/23_containers/unordered_map/operations/1.cc: 
New test.
    * 
testsuite/23_containers/unordered_multimap/operations/1.cc: New test.
    * 
testsuite/23_containers/unordered_multiset/operations/1.cc: New test.
    * testsuite/23_containers/unordered_set/operations/1.cc: 
New test.


New test run under Linux x86_64.

Ok to commit if all tests pass ?


Yes, thanks for reworking it to only add new code.

Re: [PATCH][Bug libstdc++/70303] Value-initialized debug iterators


On 08/02/21 22:27 +0100, François Dumont wrote:

On 01/02/21 8:09 pm, Jonathan Wakely wrote:

On 01/02/21 19:30 +0100, François Dumont via Libstdc++ wrote:

On 01/02/21 6:43 pm, Jonathan Wakely wrote:

On 31/01/21 16:59 +0100, François Dumont via Libstdc++ wrote:
After the debug issue has been fixed in PR 98466 the problem 
was not in the debug iterator implementation itself but in the 
deque iterator operator- implementation.


    libstdc++: Make deque iterator operator- usable with 
value-init iterators


    N3644 implies that operator- can be used on value-init 
iterators. We now return
    0 if both iterators are value initialized. If only one is 
value initialized we
    keep the UB by returning the result of a normal 
computation which is an unexpected

    value.

    libstdc++/ChangeLog:

    PR libstdc++/70303
    * include/bits/stl_deque.h 
(std::deque<>::operator-(iterator, iterator)):

    Return 0 if both iterators are value-initialized.
    * testsuite/23_containers/deque/70303.cc: New test.
    * testsuite/23_containers/vector/70303.cc: New test.

Tested under Linux x86_64, ok to commit ?


OK.

I don't like adding the branch there though. Even with the
__builtin_expect it causes worse code to be generated than the
original.

This would be branchless, but a bit harder to understand:

    return difference_type(__x._S_buffer_size())
  * (__x._M_node - __y._M_node - int(__x._M_node == __y._M_node))
  + (__x._M_cur - __x._M_first) + (__y._M_last - __y._M_cur);

Please commit the fix and we can think about it later.


I do not think this work because for value-init iterators we have 
__x._M_node == __y._M_node == nullptr so the diff would give 
-_S_buffer_size().


But I got the idear, I'll prepare the patch.


Yeah, I just came back to the computer to say that it's wrong. But
maybe this:

    return difference_type(_S_buffer_size())
  * (__x._M_node - __y._M_node - int(__x._M_node && __y._M_node))
  + (__x._M_cur - __x._M_first) + (__y._M_last - __y._M_cur);

For value-init'd iterators we'd get _S_buffer_size() * 0 + 0 - 0

We could even just use int(__x._M_node != 0) because if one is null
and the other isn't, it's UB anyway.


This is what I've tested with success. As it is a recent kind of 
regression you might want to consider it now.


    libstdc++: Remove execution branch in deque iterator operator-

    libstdc++-v3/ChangeLog:

    * include/bits/stl_deque.h
    (std::operator-(deque::iterator, deque::iterator)): 
Replace if/then with

    a null pointer test.

Ok to commit ?


OK, thanks!



François




diff --git a/libstdc++-v3/include/bits/stl_deque.h 
b/libstdc++-v3/include/bits/stl_deque.h
index 04b70b77621..8bba7a3740f 100644
--- a/libstdc++-v3/include/bits/stl_deque.h
+++ b/libstdc++-v3/include/bits/stl_deque.h
@@ -352,12 +352,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  friend difference_type
  operator-(const _Self& __x, const _Self& __y) _GLIBCXX_NOEXCEPT
  {
-   if (__builtin_expect(__x._M_node || __y._M_node, true))
- return difference_type(_S_buffer_size())
-   * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
-   + (__y._M_last - __y._M_cur);
-
-   return 0;
+   return difference_type(_S_buffer_size())
+ * (__x._M_node - __y._M_node - int(__x._M_node != 0))
+ + (__x._M_cur - __x._M_first)
+ + (__y._M_last - __y._M_cur);
  }

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -369,12 +367,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
operator-(const _Self& __x,
  const _Deque_iterator<_Tp, _RefR, _PtrR>& __y) 
_GLIBCXX_NOEXCEPT
{
- if (__builtin_expect(__x._M_node || __y._M_node, true))
-   return difference_type(_S_buffer_size())
- * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
- + (__y._M_last - __y._M_cur);
-
- return 0;
+ return difference_type(_S_buffer_size())
+   * (__x._M_node - __y._M_node - int(__x._M_node != 0))
+   + (__x._M_cur - __x._M_first)
+   + (__y._M_last - __y._M_cur);
}

  friend _Self

[PATCH] ix86: Support V{2, 4}DImode arithmetic right shifts for SSE2+ [PR98856]

Hi!

As mentioned in the PR, we don't support arithmetic right V2DImode or
V4DImode on x86 without -mavx512vl or -mxop.  The ISAs indeed don't have
{,v}psraq instructions until AVX512VL, but we actually can emulate it quite
easily.
One case is arithmetic >> 63, we can just emit {,v}pxor; {,v}pcmpgt for
that for SSE4.2+, or for SSE2 psrad $31; pshufd $0xf5.
Then arithmetic >> by constant > 32, that can be done with {,v}psrad $31
and {,v}psrad $(cst-32) and two operand permutation,
arithmetic >> 32 can be done as {,v}psrad $31 and permutation of that
and the original operand.  Arithmetic >> by constant < 32 can be done
as {,v}psrad $cst and {,v}psrlq $cst and two operand permutation.
And arithmetic >> by variable scalar amount can be done as
arithmetic >> 63, logical >> by the amount, << by (64 - amount of the
>> 63 result; note that the vector << 64 result in 0) and oring together.

I had to improve the permutation generation so that it actually handles
the needed permutations (or handles them better).

Richard, does this actually improve the benchmark that regressed?

If not, I guess this is a GCC 12 material.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2021-02-09  Jakub Jelinek  

PR tree-optimization/98856
* config/i386/i386.c (ix86_shift_rotate_cost): Add CODE argument.
Expect V2DI and V4DI arithmetic right shifts to be emulated.
(ix86_rtx_costs, ix86_add_stmt_cost): Adjust ix86_shift_rotate_cost
caller.
* config/i386/i386-expand.c (expand_vec_perm_2perm_interleave,
expand_vec_perm_2perm_pblendv): New functions.
(ix86_expand_vec_perm_const_1): Use them.
* config/i386/sse.md (ashr3): Rename to ...
(ashr3): ... this.
(ashr3): New define_expand with VI248_AVX512BW iterator.
(ashrv4di3): New define_expand.
(ashrv2di3): Change condition to TARGET_SSE2, handle !TARGET_XOP
and !TARGET_AVX512VL expansion.

* gcc.target/i386/sse2-psraq-1.c: New test.
* gcc.target/i386/sse4_2-psraq-1.c: New test.
* gcc.target/i386/avx-psraq-1.c: New test.
* gcc.target/i386/avx2-psraq-1.c: New test.
* gcc.target/i386/avx-pr82370.c: Adjust expected number of vpsrad
instructions.
* gcc.target/i386/avx2-pr82370.c: Likewise.
* gcc.target/i386/avx512f-pr82370.c: Likewise.
* gcc.target/i386/avx512bw-pr82370.c: Likewise.
* gcc.dg/torture/vshuf-4.inc: Add two further permutations.
* gcc.dg/torture/vshuf-8.inc: Likewise.

--- gcc/config/i386/i386.c.jj   2021-02-08 19:07:21.869292064 +0100
+++ gcc/config/i386/i386.c  2021-02-09 10:12:30.934084039 +0100
@@ -19689,6 +19689,7 @@ ix86_division_cost (const struct process
 
 static int
 ix86_shift_rotate_cost (const struct processor_costs *cost,
+   enum rtx_code code,
enum machine_mode mode, bool constant_op1,
HOST_WIDE_INT op1_val,
bool speed,
@@ -19727,6 +19728,19 @@ ix86_shift_rotate_cost (const struct pro
count = 7;
  return ix86_vec_cost (mode, cost->sse_op * count);
}
+  /* V*DImode arithmetic right shift is emulated.  */
+  else if (code == ASHIFTRT
+  && (mode == V2DImode || mode == V4DImode)
+  && !TARGET_XOP
+  && !TARGET_AVX512VL)
+   {
+ int count = 4;
+ if (constant_op1 && op1_val == 63 && TARGET_SSE4_2)
+   count = 2;
+ else if (constant_op1)
+   count = 3;
+ return ix86_vec_cost (mode, cost->sse_op * count);
+   }
   else
return ix86_vec_cost (mode, cost->sse_op);
 }
@@ -19896,13 +19910,15 @@ ix86_rtx_costs (rtx x, machine_mode mode
 case LSHIFTRT:
 case ROTATERT:
   bool skip_op0, skip_op1;
-  *total = ix86_shift_rotate_cost (cost, mode, CONSTANT_P (XEXP (x, 1)),
+  *total = ix86_shift_rotate_cost (cost, code, mode,
+  CONSTANT_P (XEXP (x, 1)),
   CONST_INT_P (XEXP (x, 1))
 ? INTVAL (XEXP (x, 1)) : -1,
   speed,
   GET_CODE (XEXP (x, 1)) == AND,
   SUBREG_P (XEXP (x, 1))
-  && GET_CODE (XEXP (XEXP (x, 1), 0)) == 
AND,
+  && GET_CODE (XEXP (XEXP (x, 1),
+ 0)) == AND,
   _op0, _op1);
   if (skip_op0 || skip_op1)
{
@@ -22335,11 +22351,16 @@ ix86_add_stmt_cost (class vec_info *vinf
case LROTATE_EXPR:
case RROTATE_EXPR:
  {
+   tree op1 = gimple_assign_rhs1 (stmt_info->stmt);
tree op2 = gimple_assign_rhs2 (stmt_info->stmt);
stmt_cost = ix86_shift_rotate_cost
-

New Ukrainian PO file for 'cpplib' (version 11.1-b20210207)

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Ukrainian team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/uk.po

(This file, 'cpplib-11.1-b20210207.uk.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-11.1-b20210207.uk.po'



cpplib-11.1-b20210207.uk.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

[PATCH] reduce sparseset memory requirement

Currently we use HOST_WIDEST_FAST_INT for the sparseset element
type which maps to a 64bit type on 64bit hosts.  That's excessive
for the only current sparseset users which are LRA and IRA and
which store register numbers in it which are unsigned int.  The
following changes the sparseset element type to unsigned int.

This was changed (in accident?) in 0263463dd114 and the following
just reverts that bit.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2021-02-09  Richard Biener  

* sparseset.h (SPARSESET_ELT_BITS): Remove.
(SPARSESET_ELT_TYPE): Use unsigned int.
* fwprop.c: Do not include sparseset.h.
---
 gcc/fwprop.c| 1 -
 gcc/sparseset.h | 3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 123cc228630..4b8a554e823 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -28,7 +28,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "df.h"
 #include "rtl-ssa.h"
 
-#include "sparseset.h"
 #include "predict.h"
 #include "cfgrtl.h"
 #include "cfgcleanup.h"
diff --git a/gcc/sparseset.h b/gcc/sparseset.h
index c72b4fe8aed..536d35c51af 100644
--- a/gcc/sparseset.h
+++ b/gcc/sparseset.h
@@ -83,8 +83,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Data Structure used for the SparseSet representation.  */
 
-#define SPARSESET_ELT_BITS ((unsigned) HOST_BITS_PER_WIDEST_FAST_INT)
-#define SPARSESET_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
+#define SPARSESET_ELT_TYPE unsigned int
 
 typedef struct sparseset_def
 {
-- 
2.26.2

New template for 'cpplib' made available

Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'cpplib' has been made available
to the language teams for translation.  It is archived as:

https://translationproject.org/POT-files/cpplib-11.1-b20210207.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

https://gcc.gnu.org/pub/gcc/snapshots/11-20210207/gcc-11-20210207.tar.xz

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] calls: Fix a memory leak in maybe_warn_rdwr_sizes [PR99004]

On Tue, 9 Feb 2021, Jakub Jelinek wrote:

> Hi!
> 
> The print_generic_expr_to_str function ends with
> return xstrdup (...); and therefore expects the caller to free
> the argument.
> 
> The following patch does that after it has been copied.
> Instead of doing const_cast to cast away const char * to char *,
> because the code uses s0 and s1 in so few places, I chose just
> to change the types of the two variables so that const_cast
> is not needed.  After all, it is a heap allocated string that
> this function owns and so if it wanted, it could change it too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2021-02-09  Jakub Jelinek  
> 
>   PR middle-end/99004
>   * calls.c (maybe_warn_rdwr_sizes): Change s0 and s1 type from
>   const char * to char * and free those pointers after use.
> 
> --- gcc/calls.c.jj2021-01-04 10:25:39.253229068 +0100
> +++ gcc/calls.c   2021-02-08 17:39:10.749551785 +0100
> @@ -2032,7 +2032,7 @@ maybe_warn_rdwr_sizes (rdwr_map *rwm, tr
>tree sizrng[2] = { size_zero_node, build_all_ones_cst (sizetype) };
>if (get_size_range (access_size, sizrng, true))
>   {
> -   const char *s0 = print_generic_expr_to_str (sizrng[0]);
> +   char *s0 = print_generic_expr_to_str (sizrng[0]);
> if (tree_int_cst_equal (sizrng[0], sizrng[1]))
>   {
> gcc_checking_assert (strlen (s0) < sizeof sizstr);
> @@ -2040,11 +2040,13 @@ maybe_warn_rdwr_sizes (rdwr_map *rwm, tr
>   }
> else
>   {
> -   const char *s1 = print_generic_expr_to_str (sizrng[1]);
> +   char *s1 = print_generic_expr_to_str (sizrng[1]);
> gcc_checking_assert (strlen (s0) + strlen (s1)
>  < sizeof sizstr - 4);
> sprintf (sizstr, "[%s, %s]", s0, s1);
> +   free (s1);
>   }
> +   free (s0);
>   }
>else
>   *sizstr = '\0';
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous

2021-02-09 Thread Thomas Schwinge

Hi Tobias!

On 2021-02-08T18:50:25+0100, Tobias Burnus  wrote:
> Found when looking at Julian's 3/4 OpenACC patch, which has not
> yet been committed (and needs still to be revised a bit.)
>
> The fix (a) avoids an ICE when Julian's patch has been applied.
> The patch (b) just makes one error message less confusing.
>
> The testcase shows that only %re/%im run reach the new code as
> %kind/%len are != EXPR_VARIABLE. (The error message is slightly
> misleading if the 'list item'/'var' is not a variable.)
>
>
> (a) We need to handle REF_INQUIRY gfc_is_simplify_contiguous.
>
> That function is used for 'is_contiguous(a(:)%re)', but it works
> without this patch and simplifies already to .false.
> And it is used in openmp.c, which can ICE without this patch.
>
> (b) Just makes the error message nicer - as only EXPR_VARIABLE
> reaches that code, only %re and %im are mentioned in the
> error message.

As so often, I can't really comment on the Fortran language/GCC Fortran
front end details.  ;-|

> (Actually, it is not completely clear whether %re/%im are invalid
> or not; I think they should be – but one can argue otherwise.
> For OpenMP I just saw that it is tacked internally in Issue 2661,
> for OpenACC it is tracked as Issue 346 – but neither has been
> discussed, yet.)

Thanks for filing/locating these discussion items for OpenACC/OpenMP
upstream.  May also put these references into the testcases, so that once
these get addressed, we have something to 'grep' for in GCC?

> PS: '%re'/'%im' permit accessing the real/imaginary part of a
> complex variable as lvalue (in the C++ sense) and also permit
> "var(:)%re = 1.0", which real(z) or imag(z) does not permit.
>
> var%kind == kind(var), but derived types also permit
> parameterized derived types (PDT), which can use '%foo' for respective
> 'len' and 'kind' components.

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc-gomp/ref_inquiry.f90

Originally, the 'goacc-gomp' testsuites have been meant to be used for
testcases related to OpenACC/OpenMP (non-)interoperability, such as
improper nesting etc. (should be the only/major use -- given that
currently there isn't really any OpenACC/OpenMP interoperability).

This one, I'd have put into separate 'gfortran.dg/goacc/ref_inquiry.f90'
and 'gfortran.dg/gomp/ref_inquiry.f90' (everyone cross-referencing the
other, unless using the same filename makes that obvious enough).

> @@ -0,0 +1,42 @@
> +implicit none
> +type t
> +  integer :: i
> +  character :: c
> +  complex :: z
> +end type t
> +
> +integer :: i
> +character(kind=4, len=5) :: c
> +complex :: z, zz(5)
> +type(t) :: x
> +
> +print *, is_contiguous(zz(:)%re)
> +
> +! EXPR_VARIABLE / Cf. also OpenMP spec issue 2661
> +!$omp target enter data map(to: z%re)! { dg-error "Unexpected 
> complex-parts designator" }
> +!$omp target enter data map(to: z%im)! { dg-error "Unexpected 
> complex-parts designator" }
> +!$omp target enter data map(to: x%z%re)  ! { dg-error "Unexpected 
> complex-parts designator" }
> +!$omp target enter data map(to: x%z%im)  ! { dg-error "Unexpected 
> complex-parts designator" }
> +
> +! Fails differently as it is not a variable (EXPR_VARIABLE)
> +!$omp target enter data map(to: i%kind, c%len) ! { dg-error "not a 
> proper array section" }
> +!$omp target enter data map(to: x%i%kind, x%c%len) ! { dg-error "not a 
> proper array section" }

I note that 'zz' variants (see below) are not being checked for OpenMP.

> +! Likewise for OpenACC
> +!$acc enter data copyin(i%kind, c%len) ! { dg-error "not a proper array 
> section" }
> +!$acc enter data copyin(x%i%kind)  ! { dg-error "not a proper array 
> section" }
> +!$acc enter data copyin(x%c%len)   ! { dg-error "not a proper array 
> section" }
> +!$acc update self(i%kind, c%len)   ! { dg-error "not a proper array 
> section" }
> +!$acc update self(x%i%kind)! { dg-error "not a proper array 
> section" }
> +!$acc update self(x%c%len) ! { dg-error "not a proper array 
> section" }
> +
> +! EXPR_VARIABLE / cf. OpenACC spec issue 346
> +!$acc enter data copyin(z%re)   ! { dg-error "Unexpected complex-parts 
> designator" }
> +!$acc enter data copyin(z%im)   ! { dg-error "Unexpected complex-parts 
> designator" }

Need to check that for 'zz', too?  (See below.)

> +!$acc enter data copyin(x%z%re) ! { dg-error "Unexpected complex-parts 
> designator" }
> +!$acc enter data copyin(x%z%im) ! { dg-error "Unexpected complex-parts 
> designator" }
> +!$acc update self(z%re) ! { dg-error "Unexpected complex-parts 
> designator" }
> +!$acc update self(z%im) ! { dg-error "Unexpected complex-parts 
> designator" }
> +!$acc update self(zz%re)
> +!$acc update self(zz%im)
> +end

And for OpenACC, the 'zz' variants do not emit this error message here.
(That's not immediately obvious to me.)

I can see how data mapping of '[...]%re' etc. are problematic (we're
constructing an "incomplete object"?), but 'update' etc. I'd

[PATCH] tree-optimization/99017 - be more forgiving in BB vect costing

This works around a SLP graph partitioning or cost collecting issue
by being more forgiving in vect_bb_vectorization_profitable_p.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-02-09  Richard Biener  

PR tree-optimization/99017
* tree-vect-slp.c (vect_bb_vectorization_profitable_p): Allow
zero vector cost entries.
---
 gcc/tree-vect-slp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index b9f12c30fb8..ea8a97b01c6 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4427,7 +4427,8 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
   /* Now cost the portions individually.  */
   unsigned vi = 0;
   unsigned si = 0;
-  do
+  while (si < li_scalar_costs.length ()
+&& vi < li_vector_costs.length ())
 {
   unsigned sl = li_scalar_costs[si].first;
   unsigned vl = li_vector_costs[vi].first;
@@ -4497,8 +4498,6 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
  return false;
}
 }
-  while (si < li_scalar_costs.length ()
-&& vi < li_vector_costs.length ());
   if (vi < li_vector_costs.length ())
 {
   if (dump_enabled_p ())
-- 
2.26.2

Re: [PATCH] don't enable DWARF5 by default on Windows (PR98860)

On Tue, Feb 09, 2021 at 07:47:00AM +0100, Richard Biener via Gcc-patches wrote:
> On February 8, 2021 10:44:26 PM GMT+01:00, Mikael Pettersson via Gcc-patches 
>  wrote:
> >PR98860 is a gcc-11 regression where bootstrap fails on Windows since
> >the switch to enable DWARF5 by default. The symptoms are that
> >executables generated by the stage1 compiler fail to run with "Exec
> >format error", which confuses subsequent configure steps and causes
> >hard errors. This happens even with the very latest binutils master.
> >
> >Fixed by updating SUBTARGET_OVERRIDE_OPTIONS to set dwarf_version to 4
> >unless the user explicitly requested another version. I see some other
> >targets did the same.
> >
> >Tested on Cygwin64 and mingw-w64.
> >
> >Ok for master?
> 
> It might be better to expose the default via configure (I've heard desires to 
> control that elsewhere). 

But also it would be better to see some analysis what exactly and why
doesn't work.
What program emits that Exec format error message and why.

Jakub

Re: [PATCH] x86: Always save and restore shadow stack pointer

On Mon, Feb 8, 2021 at 3:07 PM H.J. Lu via Gcc-patches
 wrote:
>
> When the SHSTK feature is not available or not enabled, RDSSP is a NOP,
> always save and restore shadow stack pointer to support compiling source
> codes, containing __builtin_setjmp and __builtin_longjmp, with different
> -fcf-protection options.

Is that an ABI change for -fno-cf-protection?

> PR target/98997
> * config/i386/i386.md (save_stack_nonlocal): Always save shadow
> stack pointer.
> (restore_stack_nonlocal): Always restore shadow stack pointer.
> (@rdssp): Make it unconditional.
> (@incssp): Likewise.
> ---
>  gcc/config/i386/i386.md | 150 ++--
>  1 file changed, 68 insertions(+), 82 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index b60784a2908..2510bd8a73d 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -19398,21 +19398,16 @@ (define_expand "save_stack_nonlocal"
>  (match_operand 1 "register_operand"))]
>""
>  {
> -  rtx stack_slot;
> -
> -  if (flag_cf_protection & CF_RETURN)
> -{
> -  /* Copy shadow stack pointer to the first slot
> -and stack pointer to the second slot.  */
> -  rtx ssp_slot = adjust_address (operands[0], word_mode, 0);
> -  stack_slot = adjust_address (operands[0], Pmode, UNITS_PER_WORD);
> -
> -  rtx reg_ssp = force_reg (word_mode, const0_rtx);
> -  emit_insn (gen_rdssp (word_mode, reg_ssp, reg_ssp));
> -  emit_move_insn (ssp_slot, reg_ssp);
> -}
> -  else
> -stack_slot = adjust_address (operands[0], Pmode, 0);
> +  /* Copy shadow stack pointer to the first slot and stack pointer to
> + the second slot.  */
> +  rtx ssp_slot = adjust_address (operands[0], word_mode, 0);
> +  rtx stack_slot = adjust_address (operands[0], Pmode, UNITS_PER_WORD);
> +
> +  /* If the SHSTK feature is not available or not enabled, the RDSSP
> + instruction is a NOP and REG_SSP is 0.  */
> +  rtx reg_ssp = force_reg (word_mode, const0_rtx);
> +  emit_insn (gen_rdssp (word_mode, reg_ssp, reg_ssp));
> +  emit_move_insn (ssp_slot, reg_ssp);
>emit_move_insn (stack_slot, operands[1]);
>DONE;
>  })
> @@ -19422,72 +19417,63 @@ (define_expand "restore_stack_nonlocal"
> (match_operand 1 "memory_operand" ""))]
>""
>  {
> -  rtx stack_slot;
> +  /* Restore shadow stack pointer from the first slot and stack pointer
> + from the second slot.  */
> +  rtx ssp_slot = adjust_address (operands[1], word_mode, 0);
> +  rtx stack_slot = adjust_address (operands[1], Pmode, UNITS_PER_WORD);
>
> -  if (flag_cf_protection & CF_RETURN)
> -{
> -  /* Restore shadow stack pointer from the first slot
> -and stack pointer from the second slot.  */
> -  rtx ssp_slot = adjust_address (operands[1], word_mode, 0);
> -  stack_slot = adjust_address (operands[1], Pmode, UNITS_PER_WORD);
> -
> -  /* Get the current shadow stack pointer.  The code below will check if
> -SHSTK feature is enabled.  If it is not enabled the RDSSP instruction
> -is a NOP.  */
> -  rtx reg_ssp = force_reg (word_mode, const0_rtx);
> -  emit_insn (gen_rdssp (word_mode, reg_ssp, reg_ssp));
> -
> -  /* Compare through subtraction the saved and the current ssp
> -to decide if ssp has to be adjusted.  */
> -  reg_ssp = expand_simple_binop (word_mode, MINUS,
> -reg_ssp, ssp_slot,
> -reg_ssp, 1, OPTAB_DIRECT);
> -
> -  /* Compare and jump over adjustment code.  */
> -  rtx noadj_label = gen_label_rtx ();
> -  emit_cmp_and_jump_insns (reg_ssp, const0_rtx, EQ, NULL_RTX,
> -  word_mode, 1, noadj_label);
> -
> -  /* Compute the number of frames to adjust.  */
> -  rtx reg_adj = gen_lowpart (ptr_mode, reg_ssp);
> -  rtx reg_adj_neg = expand_simple_unop (ptr_mode, NEG, reg_adj,
> -   NULL_RTX, 1);
> -
> -  reg_adj = expand_simple_binop (ptr_mode, LSHIFTRT, reg_adj_neg,
> -GEN_INT (exact_log2 (UNITS_PER_WORD)),
> -reg_adj, 1, OPTAB_DIRECT);
> -
> -  /* Check if number of frames <= 255 so no loop is needed.  */
> -  rtx inc_label = gen_label_rtx ();
> -  emit_cmp_and_jump_insns (reg_adj, GEN_INT (255), LEU, NULL_RTX,
> -  ptr_mode, 1, inc_label);
> -
> -  /* Adjust the ssp in a loop.  */
> -  rtx loop_label = gen_label_rtx ();
> -  emit_label (loop_label);
> -  LABEL_NUSES (loop_label) = 1;
> -
> -  rtx reg_255 = force_reg (word_mode, GEN_INT (255));
> -  emit_insn (gen_incssp (word_mode, reg_255));
> -
> -  reg_adj = expand_simple_binop (ptr_mode, MINUS,
> -reg_adj, GEN_INT (255),
> -reg_adj, 1, OPTAB_DIRECT);
> -
> -  /*

Re: [RFC] ldist: Recognize rawmemchr loop patterns

On Mon, Feb 8, 2021 at 3:11 PM Stefan Schulze Frielinghaus via
Gcc-patches  wrote:
>
> This patch adds support for recognizing loops which mimic the behaviour
> of function rawmemchr, and replaces those with an internal function call
> in case a target provides them.  In contrast to the original rawmemchr
> function, this patch also supports different instances where the memory
> pointed to and the pattern are interpreted as 8, 16, and 32 bit sized,
> respectively.
>
> This patch is not final and I'm looking for some feedback:
>
> Previously, only loops which mimic the behaviours of functions memset,
> memcpy, and memmove have been detected and replaced by corresponding
> function calls.  One characteristic of those loops/partitions is that
> they don't have a reduction.  In contrast, loops which mimic the
> behaviour of rawmemchr compute a result and therefore have a reduction.
> My current attempt is to ensure that the reduction statement is not used
> in any other partition and only in that case ignore the reduction and
> replace the loop by a function call.  We then only need to replace the
> reduction variable of the loop which contained the loop result by the
> variable of the lhs of the internal function call.  This should ensure
> that the transformation is correct independently of how partitions are
> fused/distributed in the end.  Any thoughts about this?

Currently we're forcing reduction partitions last (and force to have a single
one by fusing all partitions containing a reduction) because code-generation
does not properly update SSA form for the reduction results.  ISTR that
might be just because we do not copy the LC PHI nodes or do not adjust
them when copying.  That might not be an issue in case you replace the
partition with a call.  I guess you can try to have a testcase with
two rawmemchr patterns and a regular loop part that has to be scheduled
inbetween both for correctness.

> Furthermore, I simply added two new members (pattern, fn) to structure
> builtin_info which I consider rather hacky.  For the long run I thought
> about to split up structure builtin_info into a union where each member
> is a structure for a particular builtin of a partition, i.e., something
> like this:
>
> union builtin_info
> {
>   struct binfo_memset *memset;
>   struct binfo_memcpymove *memcpymove;
>   struct binfo_rawmemchr *rawmemchr;
> };
>
> Such that a structure for one builtin does not get "polluted" by a
> different one.  Any thoughts about this?

Probably makes sense if the list of recognized patterns grow further.

I see you use internal functions rather than builtin functions.  I guess
that's OK.  But you use new target hooks for expansion where I think
new optab entries similar to cmpmem would be more appropriate
where the distinction between 8, 16 or 32 bits can be encoded in
the modes.

Richard.

> Cheers,
> Stefan
> ---
>  gcc/internal-fn.c|  42 ++
>  gcc/internal-fn.def  |   3 +
>  gcc/target-insns.def |   3 +
>  gcc/tree-loop-distribution.c | 257 ++-
>  4 files changed, 272 insertions(+), 33 deletions(-)
>
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index dd7173126fb..9cd62544a1a 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -2917,6 +2917,48 @@ expand_VEC_CONVERT (internal_fn, gcall *)
>gcc_unreachable ();
>  }
>
> +static void
> +expand_RAWMEMCHR8 (internal_fn, gcall *stmt)
> +{
> +  if (targetm.have_rawmemchr8 ())
> +{
> +  rtx result = expand_expr (gimple_call_lhs (stmt), NULL_RTX, VOIDmode, 
> EXPAND_WRITE);
> +  rtx start = expand_normal (gimple_call_arg (stmt, 0));
> +  rtx pattern = expand_normal (gimple_call_arg (stmt, 1));
> +  emit_insn (targetm.gen_rawmemchr8 (result, start, pattern));
> +}
> +  else
> +gcc_unreachable();
> +}
> +
> +static void
> +expand_RAWMEMCHR16 (internal_fn, gcall *stmt)
> +{
> +  if (targetm.have_rawmemchr16 ())
> +{
> +  rtx result = expand_expr (gimple_call_lhs (stmt), NULL_RTX, VOIDmode, 
> EXPAND_WRITE);
> +  rtx start = expand_normal (gimple_call_arg (stmt, 0));
> +  rtx pattern = expand_normal (gimple_call_arg (stmt, 1));
> +  emit_insn (targetm.gen_rawmemchr16 (result, start, pattern));
> +}
> +  else
> +gcc_unreachable();
> +}
> +
> +static void
> +expand_RAWMEMCHR32 (internal_fn, gcall *stmt)
> +{
> +  if (targetm.have_rawmemchr32 ())
> +{
> +  rtx result = expand_expr (gimple_call_lhs (stmt), NULL_RTX, VOIDmode, 
> EXPAND_WRITE);
> +  rtx start = expand_normal (gimple_call_arg (stmt, 0));
> +  rtx pattern = expand_normal (gimple_call_arg (stmt, 1));
> +  emit_insn (targetm.gen_rawmemchr32 (result, start, pattern));
> +}
> +  else
> +gcc_unreachable();
> +}
> +
>  /* Expand the IFN_UNIQUE function according to its first argument.  */
>
>  static void
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index daeace7a34e..34247859704 100644
> ---

[PATCH] string: Add a workaround for -Wstringop-overread false positives [PR98465]

Hi!

In the PR there are several possibilities how to improve _M_disjunct at
least in certain cases so that the compiler can figure out at least in some
cases where __s is provably disjunct from _M_data() ... _M_data() + this->size()
but it is probably GCC 12 material.

The false positive warning is on this particular copy, which is done for
non-disjunct pointers when __len2 > __len1 and the __s >= __p + __len1,
i.e. __s used to point to the characters moved through _S_move a few lines 
earlier
by __len2 - __len1 characters up to make space.  That is why the
_S_copy source is __s + __len2 - __len1.  Unfortunately, when the compiler
can't prove objects are disjunct, that copying from __s + __len2 - __len1
of __len2 characters can very well mean accessing characters the source
object (if it is not disjunct) provably can't have.

The following patch works around that by making the _S_copy be a __p based
pointer instead of __s based pointer.
__s + __len2 - __len1
and
__p + (__s - __p) + (__len2 - __len1)
have the same value and the latter may seem to be uselessly longer,
but it seems at least currently in GIMPLE we keep it that way and so that is
what the warning code during expansion will see, and only actually
optimize it to __s + __len2 - __len1 during RTL when we lose information
on what is a pointer and what is a mere offset with the same mode.

So, in the end we emit exactly the same assembly, just without the false
positive warning.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-09  Jakub Jelinek  

PR middle-end/98465
* include/bits/basic_string.tcc (basic_string::_M_replace): When __s
points to the characters moved by earlier _S_move, compute the source
address using expression based on the __p pointer rather than __s
pointer.

* g++.dg/warn/Wstringop-overread-1.C: New test.

--- libstdc++-v3/include/bits/basic_string.tcc.jj   2021-02-04 
18:15:05.317113098 +0100
+++ libstdc++-v3/include/bits/basic_string.tcc  2021-02-08 19:39:16.864936344 
+0100
@@ -478,7 +478,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (__s + __len2 <= __p + __len1)
this->_S_move(__p, __s, __len2);
  else if (__s >= __p + __len1)
-   this->_S_copy(__p, __s + __len2 - __len1, __len2);
+   {
+ const size_type __poff = (__s - __p) + (__len2 - __len1);
+ this->_S_copy(__p, __p + __poff, __len2);
+   }
  else
{
  const size_type __nleft = (__p + __len1) - __s;
--- gcc/testsuite/g++.dg/warn/Wstringop-overread-1.C.jj 2021-02-08 
19:47:15.486646129 +0100
+++ gcc/testsuite/g++.dg/warn/Wstringop-overread-1.C2021-02-08 
19:46:16.118302317 +0100
@@ -0,0 +1,12 @@
+// PR middle-end/98994
+// { dg-do compile }
+// { dg-additional-options "-Wstringop-overread -O2" }
+
+#include 
+
+const char constantString[] = {42, 53};
+
+void f(std::string& s)
+{
+  s.insert(0, static_cast(constantString), 2);
+}

Jakub

Re: [PATCH] gcov: use mmap pools for KVP.

2021-02-09 Thread Martin Liška

PING^2

@Honza: ?

On 1/29/21 2:33 PM, Martin Liška wrote:

PING^1

On 1/25/21 1:51 PM, Martin Liška wrote:

On 1/22/21 3:33 PM, Jan Hubicka wrote:

It is definitly doable (gcov machinery is quite flexible WRT having more
types of counters).

Yes, that would introduce back the dropped TOPN counters which I intentionally
dropped.

We could bring back topn counters or the easier dominating vlaue ones
and add command line option. However perhaps better in long term would
be keep adding mmap ifdefs for targets where it is important...

Kernel guys seems to be getting on profile feedback with clang, so we
should keep in mind posibility of supporting that as well.

Sure, that should not be so difficult. They should handle it in their kernel
run-time.

We could however also ask users to either resort to
-fno-profile-values or implement mmap equivalent ifdefs to libgcov if
they really care about malloc profiling.

Seems reasonable to me.

Well, the current approach makes some assumptions on mmap (and malloc), but
they seem reasonable to me. I don't expect anybody building Mingw Firefox PGO
build,
it's likely unsupported configuration.

It is possible to build Firefox with mingw on windows and I would
expected feedback to work.
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Compiling_Mozilla_With_Mingw
However this is not only about Firefox - we notice problems with Firefox
since it is only real word app where we look at PGO more systematically.
But we definitly should aim for PGO to be useful for apps of similar
complexity. It would be nice to incrase this testing coverage.

I see.

Another observation about the tcmalloc 'malloc' implementation. It hangs in a
PGO build,
but libgcov would be happy if NULL would be returned.

Yep, I would expect other folks to try to PGO optimize malloc
implementations an we could run into variety of issues...

Ok, can we go with the suggested mmap patch for now?

Thanks,
Martin

Honza

Martin

So personally I do not see this as a must feature (and I think Martin
was really looking forward to drop the former topn profile
implementation :)

Another intersting case would be, of course, profiling of kernel...

Honza

[PATCH] calls: Fix a memory leak in maybe_warn_rdwr_sizes [PR99004]

Hi!

The print_generic_expr_to_str function ends with
return xstrdup (...); and therefore expects the caller to free
the argument.

The following patch does that after it has been copied.
Instead of doing const_cast to cast away const char * to char *,
because the code uses s0 and s1 in so few places, I chose just
to change the types of the two variables so that const_cast
is not needed.  After all, it is a heap allocated string that
this function owns and so if it wanted, it could change it too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-09  Jakub Jelinek  

PR middle-end/99004
* calls.c (maybe_warn_rdwr_sizes): Change s0 and s1 type from
const char * to char * and free those pointers after use.

--- gcc/calls.c.jj  2021-01-04 10:25:39.253229068 +0100
+++ gcc/calls.c 2021-02-08 17:39:10.749551785 +0100
@@ -2032,7 +2032,7 @@ maybe_warn_rdwr_sizes (rdwr_map *rwm, tr
   tree sizrng[2] = { size_zero_node, build_all_ones_cst (sizetype) };
   if (get_size_range (access_size, sizrng, true))
{
- const char *s0 = print_generic_expr_to_str (sizrng[0]);
+ char *s0 = print_generic_expr_to_str (sizrng[0]);
  if (tree_int_cst_equal (sizrng[0], sizrng[1]))
{
  gcc_checking_assert (strlen (s0) < sizeof sizstr);
@@ -2040,11 +2040,13 @@ maybe_warn_rdwr_sizes (rdwr_map *rwm, tr
}
  else
{
- const char *s1 = print_generic_expr_to_str (sizrng[1]);
+ char *s1 = print_generic_expr_to_str (sizrng[1]);
  gcc_checking_assert (strlen (s0) + strlen (s1)
   < sizeof sizstr - 4);
  sprintf (sizstr, "[%s, %s]", s0, s1);
+ free (s1);
}
+ free (s0);
}
   else
*sizstr = '\0';

Jakub

[PATCH] c++: Consider addresses of heap artificial vars always non-NULL [PR98988]

Hi!

With -fno-delete-null-pointer-checks which is e.g. implied by
-fsanitize=undefined or default on some embedded targets, the middle-end
folder doesn't consider addresses of global VAR_DECLs to be non-NULL, as one
of them could have address 0.  Still, I think malloc/operator new (at least
the nonthrowing) relies on NULL returns meaning allocation failure rather
than success.  Furthermore, the artificial VAR_DECLs we create for
constexpr new never actually live in the address space of the program,
so we can pretend they will never be NULL too.

The following patch does that, so that one can actually use constexpr
new/delete even with -fno-delete-null-pointer-checks or sanitizers
- otherwise delete itself tests whether the passed address is non-NULL and
that wouldn't fold to true if the pointer has been allocated with constexpr
new.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-09  Jakub Jelinek  

PR c++/98988
* constexpr.c (cxx_eval_binary_expression): Fold equality comparison
of address of heap artificial VAR_DECL against nullptr.

* g++.dg/cpp2a/constexpr-new16.C: New test.

--- gcc/cp/constexpr.c.jj   2021-02-02 09:52:10.535234056 +0100
+++ gcc/cp/constexpr.c  2021-02-08 17:24:28.774467744 +0100
@@ -3157,6 +3157,32 @@ cxx_eval_binary_expression (const conste
lhs = cplus_expand_constant (lhs);
   else if (TREE_CODE (rhs) == PTRMEM_CST)
rhs = cplus_expand_constant (rhs);
+  if (r == NULL_TREE && POINTER_TYPE_P (TREE_TYPE (lhs)))
+   {
+ tree op0 = lhs;
+ tree op1 = rhs;
+ if (integer_zerop (op0))
+   std::swap (op0, op1);
+ /* With -fno-delete-null-pointer-checks, generic folders
+assume addresses of non-automatic VAR_DECLs could be NULL.
+For constexpr new, as the vars aren't really living in
+target memory, assume they will never be NULL.  */
+ if (integer_zerop (op1))
+   {
+ STRIP_NOPS (op0);
+ if (TREE_CODE (op0) == ADDR_EXPR)
+   {
+ op0 = get_base_address (TREE_OPERAND (op0, 0));
+ if (op0
+ && VAR_P (op0)
+ && (DECL_NAME (op0) == heap_identifier
+ || DECL_NAME (op0) == heap_uninit_identifier
+ || DECL_NAME (op0) == heap_vec_identifier
+ || DECL_NAME (op0) == heap_vec_uninit_identifier))
+   r = constant_boolean_node (!is_code_eq, type);
+   }
+   }
+   }
 }
   if (code == POINTER_PLUS_EXPR && !*non_constant_p
   && integer_zerop (lhs) && !integer_zerop (rhs))
--- gcc/testsuite/g++.dg/cpp2a/constexpr-new16.C.jj 2021-02-08 
17:25:37.130699213 +0100
+++ gcc/testsuite/g++.dg/cpp2a/constexpr-new16.C2021-02-08 
17:26:49.175889198 +0100
@@ -0,0 +1,13 @@
+// PR c++/98988
+// { dg-do compile { target c++20 } }
+// { dg-options "-fno-delete-null-pointer-checks" }
+
+constexpr bool
+foo ()
+{
+  auto ptr = new int();
+  delete ptr;
+  return true;
+}
+
+static_assert (foo ());

Jakub

Re: PR 96391? Can we fix it for gcc11?