[Bug tree-optimization/83247] simplify (int)a_long < 0 when we know a_long fits in int

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83247

--- Comment #5 from Andrew Pinski  ---
Some failures ...

gnat.dg/opt86a.adb: pattern found 1 times
FAIL: gnat.dg/opt86a.adb scan-tree-dump-times optimized ">>" 4



+FAIL: gnat.dg/opt86b.adb scan-tree-dump-not optimized "> 13"
+FAIL: gnat.dg/opt86c.adb scan-tree-dump-not optimized "> 26"
+FAIL: gnat.dg/opt86c.adb scan-tree-dump-not optimized "> 29"

[PATCH 2/2] Ada: Finalization of constrained subtypes of unconstrained synchronized private extensions

2023-08-09 Thread Richard Wai
When generating TSS address finalization bodies for a tagged class-wide
subtype, GNAT climbs the parent chain looking for the first
"non-constrained" type. That type's underlying type's class-wide type is
used as a "designated" type for a dispatching TSS deep finalize call to the
designated class-wide type. In the case of a constrained subtype of an
unconstrained synchronized private extension, this ends up designating the
underlying type of that private extension. This means it targets the
class-wide type of the actual underlying concurrent type rather than the
corresponding record. Ultimately it ends up generating a call to the
corresponding record's deep finalizer, but with incompatible types
(concurrent_type'Class -> concurrent_typeV'Class). This causes compilation
to fail.

 

This patch adds extra logic to exp_ch7(Make_Finalize_Address_Stmts) to
identify such cases and ensure that the designated type is the corresponding
record type's class-wide type in that situation.

 

Patch file is attached.

 

--  Begin change log entry -

 

ada: TSS finalize address subprogram generation for constrained subtypes of
unconstrained synchronized private extensions should take care to designate
the corresponding record of the underlying concurrent type.

 

When generating TSS finalize address subprograms for class-wide types of
constrained root types, it follows the parent chain looking for the first
"non-constrained" type. It is possible that such a type is a private
extension with the "synchronized" keyword, in which case the underlying type
is a concurrent type. When that happens, the designated type of the finalize
address subprogram should be the corresponding record's class-wide-type.

 

Gcc/ada/

* exp_ch3(Expand_Freeze_Class_Wide_Type): Expanded comments
explaining why TSS Finalize_Address is not generated for concurrent
class-wide types.

* exp_ch7(Make_Finalize_Address_Stmts): Handle cases where
the underlying non-constrained parent type is a concurrent type, and adjust
the designated type to be the corresponding record's class-wide type.

 

--  End change log entry -

 

This patch was bootstrapped on x86_64-*-freebsd13.2. One new test cases was
added. Note that 4 gnat test cases fail currently on master and are
unrelated to this patch.

 

Check-ada output of this patch:

 

=== acats tests ===

Running chapter a ...

Running chapter c2 ...

Running chapter c3 ...

Running chapter c4 ...

Running chapter c5 ...

Running chapter c6 ...

Running chapter c7 ...

Running chapter c8 ...

Running chapter c9 ...

Running chapter ca ...

Running chapter cb ...

Running chapter cc ...

Running chapter cd ...

Running chapter ce ...

Running chapter cxa ...

Running chapter cxb ...

Running chapter cxf ...

Running chapter cxg ...

Running chapter cxh ...

Running chapter cz ...

Running chapter d ...

Running chapter e ...

Running chapter l ...

=== acats Summary ===

# of expected passes   2328

# of unexpected failures 0

 

Native configuration is x86_64-unknown-freebsd13.2

 

=== gnat tests ===

 

Schedule of variations:

unix

 

Running target unix

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 14)

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 20)

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 38)

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 42)

 

=== gnat Summary ===

 

# of expected passes   3401

# of unexpected failures 4

# of expected failures  23

# of unsupported tests   10

gnatmake version 14.0.0 20230809 (experimental)

 

 

Richard Wai

ANNEXI-STRAYLINE



ada-tss-constrained-subtype-of-private-synchronized-extention.patch
Description: Binary data


[PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-09 Thread François Dumont via Gcc-patches

Hi

I've eventually completed this work.

This evolution will allow to build libstdc++ without dual abi and using 
cxx11 abi. For the moment such a config is only accessible through the 
--enable-symvers=gnu-versioned-namespace configuration.


    libstdc++: [_GLIBCXX_INLINE_VERSION] Use cxx11 abi [PR83077]

    Use cxx11 abi when activating versioned namespace mode.

    libstdcxx-v3/ChangeLog:

    PR libstdc++/83077
    * acinclude.m4 [GLIBCXX_ENABLE_LIBSTDCXX_DUAL_ABI]: Default 
to "new" libstdcxx abi.
    * config/locale/dragonfly/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Define money_base

    members.
    * config/locale/generic/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Likewise.
    * config/locale/gnu/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Likewise.

    * config/locale/gnu/numeric_members.cc
    [!_GLIBCXX_USE_DUAL_ABI](__narrow_multibyte_chars): Define.
    * configure: Regenerate.
    * include/bits/c++config
    [_GLIBCXX_INLINE_VERSION](_GLIBCXX_NAMESPACE_CXX11, 
_GLIBCXX_BEGIN_NAMESPACE_CXX11):

    Define empty.
[_GLIBCXX_INLINE_VERSION](_GLIBCXX_END_NAMESPACE_CXX11, 
_GLIBCXX_DEFAULT_ABI_TAG):

    Likewise.
    * include/bits/cow_string.h [!_GLIBCXX_USE_CXX11_ABI]: 
Define a light version of COW

    basic_string as __std_cow_string for use in stdexcept.
    * include/std/stdexcept [_GLIBCXX_USE_CXX11_ABI]: Define 
__cow_string.

    (__cow_string(const char*)): New.
    (__cow_string::c_str()): New.
    * python/libstdcxx/v6/printers.py 
(StdStringPrinter::__init__): Set self.new_string to True

    when std::__8::basic_string type is found.
    * src/Makefile.am 
[ENABLE_SYMVERS_GNU_NAMESPACE](ldbl_alt128_compat_sources): Define empty.

    * src/Makefile.in: Regenerate.
    * src/c++11/Makefile.am (cxx11_abi_sources): Rename into...
    (dual_abi_sources): ...this. Also move cow-local_init.cc, 
cxx11-hash_tr1.cc,

    cxx11-ios_failure.cc entries to...
    (sources): ...this.
    (extra_string_inst_sources): Move cow-fstream-inst.cc, 
cow-sstream-inst.cc, cow-string-inst.cc,
    cow-string-io-inst.cc, cow-wtring-inst.cc, 
cow-wstring-io-inst.cc, cxx11-locale-inst.cc,

    cxx11-wlocale-inst.cc entries to...
    (inst_sources): ...this.
    * src/c++11/Makefile.in: Regenerate.
    * src/c++11/cow-fstream-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-locale_init.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-sstream-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-stdexcept.cc [_GLIBCXX_USE_CXX11_ABI]: 
Include .
    [_GLIBCXX_USE_DUAL_ABI || 
_GLIBCXX_USE_CXX11_ABI](__cow_string): Redefine before including
    . Define 
_GLIBCXX_DEFINE_STDEXCEPT_INSTANTIATIONS so that __cow_string definition

    in  is skipped.
    [_GLIBCXX_USE_CXX11_ABI]: Skip Transaction Memory TS 
definitions.
    * src/c++11/cow-string-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-string-io-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-wstring-inst.cc [_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cow-wstring-io-inst.cc 
[_GLIBCXX_USE_CXX11_ABI]: Skip definitions.
    * src/c++11/cxx11-hash_tr1.cc [!_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cxx11-ios_failure.cc [!_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.

    [!_GLIBCXX_USE_DUAL_ABI] (__ios_failure): Remove.
    * src/c++11/cxx11-locale-inst.cc: Cleanup, just include 
locale-inst.cc.
    * src/c++11/cxx11-stdexcept.cc [!_GLIBCXX_USE_CXX11_ABI]: 
Skip definitions.
    * src/c++11/cxx11-wlocale-inst.cc 
[!_GLIBCXX_USE_CXX11_ABI]: Skip definitions.

    * src/c++11/locale-inst-numeric.h
[!_GLIBCXX_USE_DUAL_ABI](std::use_facet>, 
std::use_facet>): Instantiate.
[!_GLIBCXX_USE_DUAL_ABI](std::has_facet>, 
std::has_facet>): Instantiate.
    [!_GLIBCXX_USE_DUAL_ABI](std::num_getistreambuf_iterator>): Instantiate.
    [!_GLIBCXX_USE_DUAL_ABI](std::num_putostreambuf_iterator>): Instantiate.
    * src/c++11/locale-inst.cc [!_GLIBCXX_USE_DUAL_ABI]: Build 
only when configured

    _GLIBCXX_USE_CXX11_ABI is equal to currently built abi.
    [!_GLIBCXX_USE_DUAL_ABI](__moneypunct_cache): 
Instantiate.
    [!_GLIBCXX_USE_DUAL_ABI](__moneypunct_cache): 
Instantiate.

    [!_GLIBCXX_USE_DUAL_ABI](__numpunct_cache): Instantiate.
    [!_GLIBCXX_USE_DUAL_ABI](__timepunct): Instantiate.
    [!_GLIBCXX_USE_DUAL_ABI](__timepunct_cache): Instantiate.
    [!_GLIBCXX_USE_DUAL_ABI](time_putostreambuf_iterator>): Instantiate.
    

[PATCH v1] RISC-V: Support RVV VFMACC rounding mode intrinsic API

2023-08-09 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch would like to support the rounding mode API for the
VFMACC for the below samples.

* __riscv_vfmacc_vv_f32m1_rm
* __riscv_vfmacc_vv_f32m1_rm_m
* __riscv_vfmacc_vf_f32m1_rm
* __riscv_vfmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmacc_frm): New class for vfmacc frm.
(vfmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-single-macc.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc  | 25 ++
 .../riscv/riscv-vector-builtins-bases.h   |  1 +
 .../riscv/riscv-vector-builtins-functions.def |  3 ++
 .../riscv/rvv/base/float-point-macc.c | 47 +++
 4 files changed, 76 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/float-point-macc.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index afe3735f5ee..1695d77e8bd 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -356,6 +356,29 @@ public:
   }
 };
 
+/* Implements below instructions for frm
+   - vfmacc
+*/
+class vfmacc_frm : public function_base
+{
+public:
+  bool has_rounding_mode_operand_p () const override { return true; }
+
+  bool has_merge_operand_p () const override { return false; }
+
+  rtx expand (function_expander ) const override
+  {
+if (e.op_info->op == OP_TYPE_vf)
+  return e.use_ternop_insn (true,
+   code_for_pred_mul_scalar (PLUS,
+ e.vector_mode ()));
+if (e.op_info->op == OP_TYPE_vv)
+  return e.use_ternop_insn (true,
+   code_for_pred_mul (PLUS, e.vector_mode ()));
+gcc_unreachable ();
+  }
+};
+
 /* Implements vrsub.  */
 class vrsub : public function_base
 {
@@ -2116,6 +2139,7 @@ static CONSTEXPR const reverse_binop_frm 
vfrdiv_frm_obj;
 static CONSTEXPR const widen_binop vfwmul_obj;
 static CONSTEXPR const widen_binop_frm vfwmul_frm_obj;
 static CONSTEXPR const vfmacc vfmacc_obj;
+static CONSTEXPR const vfmacc_frm vfmacc_frm_obj;
 static CONSTEXPR const vfnmsac vfnmsac_obj;
 static CONSTEXPR const vfmadd vfmadd_obj;
 static CONSTEXPR const vfnmsub vfnmsub_obj;
@@ -2351,6 +2375,7 @@ BASE (vfrdiv_frm)
 BASE (vfwmul)
 BASE (vfwmul_frm)
 BASE (vfmacc)
+BASE (vfmacc_frm)
 BASE (vfnmsac)
 BASE (vfmadd)
 BASE (vfnmsub)
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 2d2b52a312c..67d18412b4c 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -160,6 +160,7 @@ extern const function_base *const vfrdiv_frm;
 extern const function_base *const vfwmul;
 extern const function_base *const vfwmul_frm;
 extern const function_base *const vfmacc;
+extern const function_base *const vfmacc_frm;
 extern const function_base *const vfnmsac;
 extern const function_base *const vfmadd;
 extern const function_base *const vfnmsub;
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index d43b33ded17..92ecf8a9065 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -349,6 +349,9 @@ DEF_RVV_FUNCTION (vfnmadd, alu, full_preds, f_vvfv_ops)
 DEF_RVV_FUNCTION (vfmsub, alu, full_preds, f__ops)
 DEF_RVV_FUNCTION (vfmsub, alu, full_preds, f_vvfv_ops)
 
+DEF_RVV_FUNCTION (vfmacc_frm, alu_frm, full_preds, f__ops)
+DEF_RVV_FUNCTION (vfmacc_frm, alu_frm, full_preds, f_vvfv_ops)
+
 // 13.7. Vector Widening Floating-Point Fused Multiply-Add Instructions
 DEF_RVV_FUNCTION (vfwmacc, alu, full_preds, f_wwvv_ops)
 DEF_RVV_FUNCTION (vfwmacc, alu, full_preds, f_wwfv_ops)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-macc.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-macc.c
new file mode 100644
index 000..df29f4d240f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-macc.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+typedef float float32_t;
+
+vfloat32m1_t
+test_riscv_vfmacc_vv_f32m1_rm (vfloat32m1_t vd, vfloat32m1_t op1,
+  vfloat32m1_t op2, size_t vl) {
+  return __riscv_vfmacc_vv_f32m1_rm (vd, op1, op2, 0, vl);
+}
+
+vfloat32m1_t
+test_vfmacc_vv_f32m1_rm_m (vbool32_t mask, vfloat32m1_t vd, vfloat32m1_t op1,
+  vfloat32m1_t op2, size_t vl) {
+  return __riscv_vfmacc_vv_f32m1_rm_m (mask, vd, op1, op2, 1, vl);
+}
+

[PATCH 1/2] Ada: Synchronized private extensions are always limited

2023-08-09 Thread Richard Wai
GNAT currently considers a synchronized private extension that derives from
an interface to be limited only when said interface is a concurrent
interface. However it is of course legal for a synchronized private
extension to derive from a limited interface. In this case GNAT fails to
correctly determine that the private extension is limited.

 

This causes two separate problems that makes discriminated types in such a
case impossible:

1.  GNAT inappropriately rejects compilation, claiming default
discriminants on such a private extension are illegal.
2.  GNAT fails to generate the expected discriminals for the
unconstrained discriminanted case, leading to the corresponding
discriminants of the "corresponding record" of the underlying concurrent
type to have no identifiers, and thus compilation fails.

 

Fairly simple fix. If "synchronized" appears in the private extension
declaration, it is limited. This is explicit in the RM as well (7.3(6/2)).

 

Fixing this bug uncovered of a related bug wrt. TSS address finalizer
generation for constrained subtypes of synchronized private extensions with
no default discriminants. That patch is to follow separately.

 

Patch file is attached.

 

--  Begin change log entry --

 

ada: Private extensions with the keyword "synchronized" are always limited.

 

GNAT was relying on synchronized private type extensions deriving from a
concurrent interface to determine its limitedness. This does not cover the
case where such an extension derives a limited interface. RM-7.6(6/2) makes
is clear that "synchronized" in a private extension implies the derived type
is limited. GNAT should explicitly check for the presence of "synchronized"
in a private extension declaration, and it should have the same effect as
the presence of "limited".

 

gcc/ada/

* sem_ch3.adb (Build_Derived_Record_Type): Treat presence of
keyword "synchronized" the same as "limited" when determining if a private
extension is limited.

 

-- End change log entry --

 

This patch was bootstrapped on x86_64-*-freebsd13.2. Two new test cases were
added. Note that 4 gnat test cases fail currently on master and are
unrelated to this patch.

 

Check-ada output of this patch:

=== acats tests ===

Running chapter a ...

Running chapter c2 ...

Running chapter c3 ...

Running chapter c4 ...

Running chapter c5 ...

Running chapter c6 ...

Running chapter c7 ...

Running chapter c8 ...

Running chapter c9 ...

Running chapter ca ...

Running chapter cb ...

Running chapter cc ...

Running chapter cd ...

Running chapter ce ...

Running chapter cxa ...

Running chapter cxb ...

Running chapter cxf ...

Running chapter cxg ...

Running chapter cxh ...

Running chapter cz ...

Running chapter d ...

Running chapter e ...

Running chapter l ...

=== acats Summary ===

# of expected passes   2328

# of unexpected failures 0

 

Native configuration is x86_64-unknown-freebsd13.2

 

=== gnat tests ===

 

Schedule of variations:

unix

 

Running target unix

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 14)

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 20)

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 38)

FAIL: gnat.dg/specs/alignment2.ads  (test for warnings, line 42)

 

=== gnat Summary ===

 

# of expected passes   3402

# of unexpected failures4

# of expected failures      23

# of unsupported tests   10

gnatmake version 14.0.0 20230809 (experimental)

 

 

Richard Wai

ANNEXI-STRAYLINE



ada-synchronized-private-types-are-limited.patch
Description: Binary data


RE: [PATCH v3] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Li, Pan2 via Gcc-patches
Committed with some cleanup in change log, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, August 10, 2023 11:14 AM
To: Li, Pan2 
Cc: GCC Patches ; 钟居哲 ; Jeff Law 
; Wang, Yanzhang ; Kito Cheng 

Subject: Re: [PATCH v3] RISC-V: Refactor RVV frm_mode attr for rounding mode 
intrinsic

LGTM

mailto:pan2...@intel.com>> 於 2023年8月10日 週四 11:12 寫道:
From: Pan Li mailto:pan2...@intel.com>>

The frm_mode attr has some assumptions for each define insn as below.

1. The define insn has at least 9 operands.
2. The operands[9] must be frm reg.
3. The operands[9] must be const int.

Actually, the frm operand can be operands[8], operands[9] or
operands[10], and not all the define insn has frm operands.

This patch would like to refactor frm and eliminate the above
assumptions, as well as unblock the underlying rounding mode intrinsic
API support.

After refactor, the default frm will be none, and the selected insn type
will be dyn. For the floating point which honors the frm, we will
set the frm_mode attr explicitly in define_insn.

Signed-off-by: Pan Li mailto:pan2...@intel.com>>
Co-Authored-by: Kito Cheng mailto:kito.ch...@sifive.com>>

gcc/ChangeLog:

* config/riscv/riscv-protos.h (get_frm_mode): Remove operand
assumptions.
* config/riscv/riscv-v.cc (get_frm_mode): New function.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn):
* config/riscv/vector.md: Set frm_mode attr explicitly.

gcc/ChangeLog:

* config/riscv/riscv-protos.h
(enum floating_point_rounding_mode): Add NONE, DYN_EXIT and DYN_CALL.
(get_frm_mode): New declaration.
* config/riscv/riscv-v.cc (get_frm_mode): New function to get frm mode.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn): Take care of frm reg.
* config/riscv/riscv.cc (riscv_static_frm_mode_p): Migrate to FRM_XXX.
(riscv_emit_frm_mode_set): Ditto.
(riscv_emit_mode_set): Ditto.
(riscv_frm_adjust_mode_after_call): Ditto.
(riscv_frm_mode_needed): Ditto.
(riscv_frm_mode_after): Ditto.
(riscv_mode_entry): Ditto.
(riscv_mode_exit): Ditto.
* config/riscv/riscv.h (NUM_MODES_FOR_MODE_SWITCHING): Ditto.
* config/riscv/vector.md
(rne,rtz,rdn,rup,rmm,dyn,dyn_exit,dyn_call,none): Removed
(symbol_ref): * config/riscv/vector.md: Set frm_mode attr explicitly.
---
 gcc/config/riscv/riscv-protos.h   |   4 +
 gcc/config/riscv/riscv-v.cc   |  29 
 gcc/config/riscv/riscv-vector-builtins.cc |  22 ++-
 gcc/config/riscv/riscv.cc |  46 +++---
 gcc/config/riscv/riscv.h  |   2 +-
 gcc/config/riscv/vector.md| 172 ++
 6 files changed, 186 insertions(+), 89 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ac9c4b02f17..2fbed04ff84 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -349,8 +349,12 @@ enum floating_point_rounding_mode
   FRM_DYN = 7, /* Aka 0b111.  */
   FRM_STATIC_MIN = FRM_RNE,
   FRM_STATIC_MAX = FRM_RMM,
+  FRM_DYN_EXIT = 8,
+  FRM_DYN_CALL = 9,
+  FRM_NONE = 10,
 };

+enum floating_point_rounding_mode get_frm_mode (rtx);
 opt_machine_mode vectorize_related_mode (machine_mode, scalar_mode,
 poly_uint64);
 unsigned int autovectorize_vector_modes (vec *, bool);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 0bea04c1967..c9f0a4a9e7b 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -112,6 +112,7 @@ public:
   {
 m_has_fp_rounding_mode_p = true;
 m_fp_rounding_mode = mode;
+gcc_assert (mode <= FRM_DYN);
   }

   void add_output_operand (rtx x, machine_mode mode)
@@ -1590,6 +1591,34 @@ expand_const_vector (rtx target, rtx src)
 gcc_unreachable ();
 }

+/* Get the frm mode with given CONST_INT rtx, the default mode is
+   FRM_DYN.  */
+enum floating_point_rounding_mode
+get_frm_mode (rtx operand)
+{
+  gcc_assert (CONST_INT_P (operand));
+
+  switch (INTVAL (operand))
+{
+case FRM_RNE:
+  return FRM_RNE;
+case FRM_RTZ:
+  return FRM_RTZ;
+case FRM_RDN:
+  return FRM_RDN;
+case FRM_RUP:
+  return FRM_RUP;
+case FRM_RMM:
+  return FRM_RMM;
+case FRM_DYN:
+  return FRM_DYN;
+default:
+  gcc_unreachable ();
+}
+
+  gcc_unreachable ();
+}
+
 /* Expand a pre-RA RVV data move from SRC to DEST.
It expands move for RVV fractional vector modes.  */
 bool
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 528dca7ae85..abab06c00ed 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3730,17 +3730,29 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
insn_code icode)
 }

   for (int argno = arg_offset; 

RE: Re: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL PASS]

2023-08-09 Thread Li, Pan2 via Gcc-patches
Thanks Jeff. Ported to gcc-13 with minor changes to test cases.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of juzhe.zh...@rivai.ai
Sent: Thursday, August 10, 2023 8:50 AM
To: jeffreyalaw ; gcc-patches 
Cc: kito.cheng ; Kito.cheng ; 
Robin Dapp 
Subject: Re: Re: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate 
[VSETVL PASS]

Yes. I think so. Will backport GCC 13 soon.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-08-10 01:01
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL 
PASS]
 
 
On 8/9/23 04:51, Juzhe-Zhong wrote:
> Realize we have a bug in VSETVL PASS which is triggered by 
> strided_load_run-1.c in RV32 system.
> 
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> 
> This is because VSETVL PASS incorrect hoist vsetvl instruction:
> 
> ...
> 10156: 0d9075d7  vsetvli a1,zero,e64,m2,ta,ma ---> pollute 'a1' 
> register which will be used by following insns.
> 1015a: 01d586b3  add a3,a1,t4  > use 'a1'
> 1015e: 5e070257  vmv.v.v v4,v14
> 10162: b7032257  vmacc.vv v4,v6,v16
> 10166: 26440257  vand.vv v4,v4,v8
> 1016a: 22880227  vs2r.v v4,(a6)
> 1016e: 00b6b7b3  sltu a5,a3,a1
> 10172: 22888227  vs2r.v v4,(a7)
> 10176: 9e60b157  vmv2r.v v2,v6
> 1017a: 97baadd a5,a5,a4
> 1017c: a6a62157  vmadd.vv v2,v12,v10
> 10180: 26240157  vand.vv v2,v2,v8
> 10184: 22830127  vs2r.v v2,(t1)
> 10188: 873emv a4,a5
> 1018a: 982aadd a6,a6,a0
> 1018c: 98aaadd a7,a7,a0
> 1018e: 932aadd t1,t1,a0
> 10190: 85b6mv a1,a3   -> set 'a1'
> ...
> 
> gcc/ChangeLog:
> 
>  * config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix 
> incorrect anticipate info.
> 
> gcc/testsuite/ChangeLog:
> 
>  * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: 
> Adapt test.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
OK.
 
Do we need to backport this to gcc-13?
 
jeff
 


Re: [PATCH v3] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Kito Cheng via Gcc-patches
LGTM

 於 2023年8月10日 週四 11:12 寫道:

> From: Pan Li 
>
> The frm_mode attr has some assumptions for each define insn as below.
>
> 1. The define insn has at least 9 operands.
> 2. The operands[9] must be frm reg.
> 3. The operands[9] must be const int.
>
> Actually, the frm operand can be operands[8], operands[9] or
> operands[10], and not all the define insn has frm operands.
>
> This patch would like to refactor frm and eliminate the above
> assumptions, as well as unblock the underlying rounding mode intrinsic
> API support.
>
> After refactor, the default frm will be none, and the selected insn type
> will be dyn. For the floating point which honors the frm, we will
> set the frm_mode attr explicitly in define_insn.
>
> Signed-off-by: Pan Li 
> Co-Authored-by: Kito Cheng 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (get_frm_mode): Remove operand
> assumptions.
> * config/riscv/riscv-v.cc (get_frm_mode): New function.
> * config/riscv/riscv-vector-builtins.cc
> (function_expander::use_ternop_insn):
> * config/riscv/vector.md: Set frm_mode attr explicitly.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h
> (enum floating_point_rounding_mode): Add NONE, DYN_EXIT and
> DYN_CALL.
> (get_frm_mode): New declaration.
> * config/riscv/riscv-v.cc (get_frm_mode): New function to get frm
> mode.
> * config/riscv/riscv-vector-builtins.cc
> (function_expander::use_ternop_insn): Take care of frm reg.
> * config/riscv/riscv.cc (riscv_static_frm_mode_p): Migrate to
> FRM_XXX.
> (riscv_emit_frm_mode_set): Ditto.
> (riscv_emit_mode_set): Ditto.
> (riscv_frm_adjust_mode_after_call): Ditto.
> (riscv_frm_mode_needed): Ditto.
> (riscv_frm_mode_after): Ditto.
> (riscv_mode_entry): Ditto.
> (riscv_mode_exit): Ditto.
> * config/riscv/riscv.h (NUM_MODES_FOR_MODE_SWITCHING): Ditto.
> * config/riscv/vector.md
> (rne,rtz,rdn,rup,rmm,dyn,dyn_exit,dyn_call,none): Removed
> (symbol_ref): * config/riscv/vector.md: Set frm_mode attr
> explicitly.
> ---
>  gcc/config/riscv/riscv-protos.h   |   4 +
>  gcc/config/riscv/riscv-v.cc   |  29 
>  gcc/config/riscv/riscv-vector-builtins.cc |  22 ++-
>  gcc/config/riscv/riscv.cc |  46 +++---
>  gcc/config/riscv/riscv.h  |   2 +-
>  gcc/config/riscv/vector.md| 172 ++
>  6 files changed, 186 insertions(+), 89 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h
> b/gcc/config/riscv/riscv-protos.h
> index ac9c4b02f17..2fbed04ff84 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -349,8 +349,12 @@ enum floating_point_rounding_mode
>FRM_DYN = 7, /* Aka 0b111.  */
>FRM_STATIC_MIN = FRM_RNE,
>FRM_STATIC_MAX = FRM_RMM,
> +  FRM_DYN_EXIT = 8,
> +  FRM_DYN_CALL = 9,
> +  FRM_NONE = 10,
>  };
>
> +enum floating_point_rounding_mode get_frm_mode (rtx);
>  opt_machine_mode vectorize_related_mode (machine_mode, scalar_mode,
>  poly_uint64);
>  unsigned int autovectorize_vector_modes (vec *, bool);
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 0bea04c1967..c9f0a4a9e7b 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -112,6 +112,7 @@ public:
>{
>  m_has_fp_rounding_mode_p = true;
>  m_fp_rounding_mode = mode;
> +gcc_assert (mode <= FRM_DYN);
>}
>
>void add_output_operand (rtx x, machine_mode mode)
> @@ -1590,6 +1591,34 @@ expand_const_vector (rtx target, rtx src)
>  gcc_unreachable ();
>  }
>
> +/* Get the frm mode with given CONST_INT rtx, the default mode is
> +   FRM_DYN.  */
> +enum floating_point_rounding_mode
> +get_frm_mode (rtx operand)
> +{
> +  gcc_assert (CONST_INT_P (operand));
> +
> +  switch (INTVAL (operand))
> +{
> +case FRM_RNE:
> +  return FRM_RNE;
> +case FRM_RTZ:
> +  return FRM_RTZ;
> +case FRM_RDN:
> +  return FRM_RDN;
> +case FRM_RUP:
> +  return FRM_RUP;
> +case FRM_RMM:
> +  return FRM_RMM;
> +case FRM_DYN:
> +  return FRM_DYN;
> +default:
> +  gcc_unreachable ();
> +}
> +
> +  gcc_unreachable ();
> +}
> +
>  /* Expand a pre-RA RVV data move from SRC to DEST.
> It expands move for RVV fractional vector modes.  */
>  bool
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc
> b/gcc/config/riscv/riscv-vector-builtins.cc
> index 528dca7ae85..abab06c00ed 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -3730,17 +3730,29 @@ function_expander::use_ternop_insn (bool
> vd_accum_p, insn_code icode)
>  }
>
>for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
> -add_input_operand (argno);
> +{
> +  if (base->has_rounding_mode_operand_p ()
> +

RE: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Li, Pan2 via Gcc-patches
Thanks Kito, update it in PATCH v3, and passed riscv/rvv.exp already.

https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626918.html

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 10, 2023 10:21 AM
To: Li, Pan2 
Cc: Kito Cheng ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; jeffreya...@gmail.com; Wang, Yanzhang 

Subject: Re: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode 
intrinsic

Yeah, no further comment from me :)

On Thu, Aug 10, 2023 at 10:16 AM Li, Pan2  wrote:
>
> Thanks kito. It makes sense, should not reach default, may I prepare v3(add 
> gcc_unreachable to default) if no more comments?
>
> Pan
>
> -Original Message-
> From: Kito Cheng 
> Sent: Thursday, August 10, 2023 10:12 AM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; 
> Wang, Yanzhang ; Kito Cheng 
> Subject: Re: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode 
> intrinsic
>
> > +/* Get the frm mode with given CONST_INT rtx, the default mode is
> > +   FRM_DYN.  */
> > +enum floating_point_rounding_mode
> > +get_frm_mode (rtx operand)
> > +{
> > +  gcc_assert (CONST_INT_P (operand));
> > +
> > +  switch (INTVAL (operand))
> > +{
> > +case FRM_RNE:
> > +  return FRM_RNE;
> > +case FRM_RTZ:
> > +  return FRM_RTZ;
> > +case FRM_RDN:
> > +  return FRM_RDN;
> > +case FRM_RUP:
> > +  return FRM_RUP;
> > +case FRM_RMM:
> > +  return FRM_RMM;
> > +case FRM_DYN:
> > +  return FRM_DYN;
> > +default:
> > +  return FRM_DYN;
>
> Should we put a gcc_unreachable or gcc_assert here? I am not sure if
> another value is valid when it appeared for this operand?
>
> > +}
> > +
> > +  gcc_unreachable ();
> > +}


[PATCH v3] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Pan Li via Gcc-patches
From: Pan Li 

The frm_mode attr has some assumptions for each define insn as below.

1. The define insn has at least 9 operands.
2. The operands[9] must be frm reg.
3. The operands[9] must be const int.

Actually, the frm operand can be operands[8], operands[9] or
operands[10], and not all the define insn has frm operands.

This patch would like to refactor frm and eliminate the above
assumptions, as well as unblock the underlying rounding mode intrinsic
API support.

After refactor, the default frm will be none, and the selected insn type
will be dyn. For the floating point which honors the frm, we will
set the frm_mode attr explicitly in define_insn.

Signed-off-by: Pan Li 
Co-Authored-by: Kito Cheng 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (get_frm_mode): Remove operand
assumptions.
* config/riscv/riscv-v.cc (get_frm_mode): New function.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn):
* config/riscv/vector.md: Set frm_mode attr explicitly.

gcc/ChangeLog:

* config/riscv/riscv-protos.h
(enum floating_point_rounding_mode): Add NONE, DYN_EXIT and DYN_CALL.
(get_frm_mode): New declaration.
* config/riscv/riscv-v.cc (get_frm_mode): New function to get frm mode.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn): Take care of frm reg.
* config/riscv/riscv.cc (riscv_static_frm_mode_p): Migrate to FRM_XXX.
(riscv_emit_frm_mode_set): Ditto.
(riscv_emit_mode_set): Ditto.
(riscv_frm_adjust_mode_after_call): Ditto.
(riscv_frm_mode_needed): Ditto.
(riscv_frm_mode_after): Ditto.
(riscv_mode_entry): Ditto.
(riscv_mode_exit): Ditto.
* config/riscv/riscv.h (NUM_MODES_FOR_MODE_SWITCHING): Ditto.
* config/riscv/vector.md
(rne,rtz,rdn,rup,rmm,dyn,dyn_exit,dyn_call,none): Removed
(symbol_ref): * config/riscv/vector.md: Set frm_mode attr explicitly.
---
 gcc/config/riscv/riscv-protos.h   |   4 +
 gcc/config/riscv/riscv-v.cc   |  29 
 gcc/config/riscv/riscv-vector-builtins.cc |  22 ++-
 gcc/config/riscv/riscv.cc |  46 +++---
 gcc/config/riscv/riscv.h  |   2 +-
 gcc/config/riscv/vector.md| 172 ++
 6 files changed, 186 insertions(+), 89 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ac9c4b02f17..2fbed04ff84 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -349,8 +349,12 @@ enum floating_point_rounding_mode
   FRM_DYN = 7, /* Aka 0b111.  */
   FRM_STATIC_MIN = FRM_RNE,
   FRM_STATIC_MAX = FRM_RMM,
+  FRM_DYN_EXIT = 8,
+  FRM_DYN_CALL = 9,
+  FRM_NONE = 10,
 };
 
+enum floating_point_rounding_mode get_frm_mode (rtx);
 opt_machine_mode vectorize_related_mode (machine_mode, scalar_mode,
 poly_uint64);
 unsigned int autovectorize_vector_modes (vec *, bool);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 0bea04c1967..c9f0a4a9e7b 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -112,6 +112,7 @@ public:
   {
 m_has_fp_rounding_mode_p = true;
 m_fp_rounding_mode = mode;
+gcc_assert (mode <= FRM_DYN);
   }
 
   void add_output_operand (rtx x, machine_mode mode)
@@ -1590,6 +1591,34 @@ expand_const_vector (rtx target, rtx src)
 gcc_unreachable ();
 }
 
+/* Get the frm mode with given CONST_INT rtx, the default mode is
+   FRM_DYN.  */
+enum floating_point_rounding_mode
+get_frm_mode (rtx operand)
+{
+  gcc_assert (CONST_INT_P (operand));
+
+  switch (INTVAL (operand))
+{
+case FRM_RNE:
+  return FRM_RNE;
+case FRM_RTZ:
+  return FRM_RTZ;
+case FRM_RDN:
+  return FRM_RDN;
+case FRM_RUP:
+  return FRM_RUP;
+case FRM_RMM:
+  return FRM_RMM;
+case FRM_DYN:
+  return FRM_DYN;
+default:
+  gcc_unreachable ();
+}
+
+  gcc_unreachable ();
+}
+
 /* Expand a pre-RA RVV data move from SRC to DEST.
It expands move for RVV fractional vector modes.  */
 bool
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 528dca7ae85..abab06c00ed 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3730,17 +3730,29 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
insn_code icode)
 }
 
   for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
-add_input_operand (argno);
+{
+  if (base->has_rounding_mode_operand_p ()
+ && argno == call_expr_nargs (exp) - 2)
+   {
+ /* Since the rounding mode argument position is not consistent with
+the instruction pattern, we need to skip rounding mode argument
+here.  */
+ continue;
+   }
+  add_input_operand (argno);
+  

[PATCH 1/1] RISC-V: Make "prefetch.i" built-in usable

2023-08-09 Thread Tsukasa OI via Gcc-patches
From: Tsukasa OI 

The "__builtin_riscv_zicbop_cbo_prefetchi" built-in function was terribly
broken so that practically unusable.  It emitted "prefetch.i" but with no
meaningful arguments.

Though incompatible, this commit completely changes the function prototype
of this built-in and makes it usable.  To minimize the functionality issues,
it renames the built-in to "__builtin_riscv_zicbop_prefetch_i".

gcc/ChangeLog:

* config/riscv/riscv-cmo.def: Fix function prototype.
* config/riscv/riscv.md (riscv_prefetchi_): Fix instruction
prototype.  Remove possible prefectch type argument
* doc/extend.texi: Document __builtin_riscv_zicbop_prefetch_i.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-zicbop-1.c: Reflect new built-in prototype.
* gcc.target/riscv/cmo-zicbop-2.c: Likewise.
---
 gcc/config/riscv/riscv-cmo.def|  4 ++--
 gcc/config/riscv/riscv.md |  5 ++---
 gcc/doc/extend.texi   |  7 +++
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c |  8 +---
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c | 10 ++
 5 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/riscv-cmo.def b/gcc/config/riscv/riscv-cmo.def
index b92044dc6ff9..2286c8a25544 100644
--- a/gcc/config/riscv/riscv-cmo.def
+++ b/gcc/config/riscv/riscv-cmo.def
@@ -13,8 +13,8 @@ RISCV_BUILTIN (zero_si, "zicboz_cbo_zero", 
RISCV_BUILTIN_DIRECT_NO_TARGET, RISCV
 RISCV_BUILTIN (zero_di, "zicboz_cbo_zero", RISCV_BUILTIN_DIRECT_NO_TARGET, 
RISCV_VOID_FTYPE_VOID_PTR, zero64),
 
 // zicbop
-RISCV_BUILTIN (prefetchi_si, "zicbop_cbo_prefetchi", RISCV_BUILTIN_DIRECT, 
RISCV_SI_FTYPE_SI, prefetchi32),
-RISCV_BUILTIN (prefetchi_di, "zicbop_cbo_prefetchi", RISCV_BUILTIN_DIRECT, 
RISCV_DI_FTYPE_DI, prefetchi64),
+RISCV_BUILTIN (prefetchi_si, "zicbop_prefetch_i", 
RISCV_BUILTIN_DIRECT_NO_TARGET, RISCV_VOID_FTYPE_VOID_PTR, prefetchi32),
+RISCV_BUILTIN (prefetchi_di, "zicbop_prefetch_i", 
RISCV_BUILTIN_DIRECT_NO_TARGET, RISCV_VOID_FTYPE_VOID_PTR, prefetchi64),
 
 // zbkc or zbc
 RISCV_BUILTIN (clmul_si, "clmul", RISCV_BUILTIN_DIRECT, RISCV_SI_FTYPE_SI_SI, 
clmul_zbkc32_or_zbc32),
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 688fd697255b..5658c7b7e113 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3273,9 +3273,8 @@
 })
 
 (define_insn "riscv_prefetchi_"
-  [(unspec_volatile:X [(match_operand:X 0 "address_operand" "r")
-  (match_operand:X 1 "imm5_operand" "i")]
-  UNSPECV_PREI)]
+  [(unspec_volatile:X [(match_operand:X 0 "register_operand" "r")]
+UNSPECV_PREI)]
   "TARGET_ZICBOP"
   "prefetch.i\t%a0"
 )
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 89c5b4ea2b20..0eb98fc89e3f 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21575,6 +21575,13 @@ Xgnuzihintpausestate extension, which redefines the 
@code{pause} instruction to
 change architectural state.
 @enddefbuiltin
 
+@defbuiltin{void __builtin_riscv_zicbop_prefetch_i (void *@var{addr})}
+Generates the @code{prefetch.i} machine instruction to instruct the hardware
+that a cache block containing @var{addr} is likely to be accessed by an
+instruction fetch in the near future.
+Available if the Zicbop extension is enabled.
+@enddefbuiltin
+
 @node RISC-V Vector Intrinsics
 @subsection RISC-V Vector Intrinsics
 
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
index c5d78c1763d3..0d5b58c4fadf 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
@@ -13,11 +13,13 @@ void foo (char *p)
   __builtin_prefetch (p, 1, 3);
 }
 
-int foo1()
+void foo1 ()
 {
-  return __builtin_riscv_zicbop_cbo_prefetchi(1);
+  __builtin_riscv_zicbop_prefetch_i(0);
+  __builtin_riscv_zicbop_prefetch_i();
+  __builtin_riscv_zicbop_prefetch_i((void*)0x111);
 }
 
-/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */
+/* { dg-final { scan-assembler-times "prefetch.i" 3 } } */
 /* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
 /* { dg-final { scan-assembler-times "prefetch.w" 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
index 6576365b39ca..09655c4b8593 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
@@ -13,11 +13,13 @@ void foo (char *p)
   __builtin_prefetch (p, 1, 3);
 }
 
-int foo1()
+void foo1 ()
 {
-  return __builtin_riscv_zicbop_cbo_prefetchi(1);
+  __builtin_riscv_zicbop_prefetch_i(0);
+  __builtin_riscv_zicbop_prefetch_i();
+  __builtin_riscv_zicbop_prefetch_i((void*)0x111);
 }
 
-/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */
+/* { dg-final { scan-assembler-times "prefetch.i" 3 } } */
 /* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
-/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */ 
+/* { 

[PATCH 0/1] RISC-V: Make "prefetch.i" built-in usable

2023-08-09 Thread Tsukasa OI via Gcc-patches
Hello,

I found that a built-in function "__builtin_riscv_zicbop_cbo_prefetchi" is
terribly broken so that this is practically unusable.  It emits the
"prefetch.i" machine instruction HINT but with no usable arguments.

Contents of a.c:

> void function_to_be_called(void);
> 
> void sample(void)
> {
> __builtin_riscv_zicbop_cbo_prefetchi(_to_be_called);
> function_to_be_called();
> }


Compiling with the 'Zicbop' extension fails.

> $ riscv64-unknown-elf-gcc -O2 -march=rv64i_zicbop -mabi=lp64 -c a.c
> a.c: In function 'sample':
> a.c:5:42: warning: passing argument 1 of 
> '__builtin_riscv_zicbop_cbo_prefetchi' makes integer from pointer without a 
> cast [-Wint-conversion]
> 5 | __builtin_riscv_zicbop_cbo_prefetchi(_to_be_called);
>   |  ^~
>   |  |
>   |  void (*)(void)
> a.c:5:42: note: expected 'long int' but argument is of type 'void (*)(void)'
> a.c:5:5: error: invalid argument to built-in function
> 5 | __builtin_riscv_zicbop_cbo_prefetchi(_to_be_called);
>   | ^~~~

seems the prototype of this built-in function is...

> int  __builtin_riscv_zicbop_cbo_prefetchi(int);  // RV32
> long __builtin_riscv_zicbop_cbo_prefetchi(long); // RV64

I started to have a bad feeling about this.




Looking at the test case, following code compiles (sort of).

> void sample(void)
> {
> __builtin_riscv_zicbop_cbo_prefetchi(1); /* integer constant: 0-4 
> (inclusive) */
> }

WHAT IS THIS PREFETCHING!?

The answer was obvious (tested with GCC 13.2.0).

> $ riscv64-unknown-elf-gcc -O2 -march=rv64i_zicbop -mabi=lp64 -S a.c
> $ cat a.s
> .file   "a.c"
> .option nopic
> .attribute arch, "rv64i2p1_zicbop1p0"
> .attribute unaligned_access, 0
> .attribute stack_align, 16
> .text
> .align  2
> .globl  sample
> .type   sample, @function
> sample:
> li  a5,0
> prefetch.i  0(a5)
> ret
> .size   sample, .-sample
> .ident  "GCC: (gc891d8dc23e1) 13.2.0"

... none (NULL).  Even the prefetching constant (1) is ignored.




It makes no sense whatsoever.  This patch set is definitely compatibility-
breaking but necessary to make "prefetch.i" work.

This patch set makes following built-in function.

> void  __builtin_riscv_zicbop_prefetch_i(void*);

Did you notice that I renamed the function?  There are three reasons:

1.  We needed to break the compatibility anyway.
2.  To avoid functionality problems.
Co-existing functional and non-functional
__builtin_riscv_zicbop_cbo_prefetchi built-in felt like a nightmare.
3.  Although this is also a cache block operation, the instruction
"prefetch.i" does not have the name "cbo" in it
(unlike "__builtin_riscv_zicbom_cbo_clean" corresponding "cbo.clean").


Let's make a working sample (a.c):

> void function_to_be_called(void);
> 
> void sample(void)
> {
> __builtin_riscv_zicbop_prefetch_i(_to_be_called);
> function_to_be_called();
> }

And take look at the output assembly and its relocations.

> $ riscv64-unknown-elf-gcc -O2 -march=rv64i_zicbop -mabi=lp64 -S a.c
> $ cat a.s
> .file   "a.c"
> .option nopic
> .attribute arch, "rv64i2p1_zicbop1p0"
> .attribute unaligned_access, 0
> .attribute stack_align, 16
> .text
> .align  2
> .globl  sample
> .type   sample, @function
> sample:
> lui a5,%hi(function_to_be_called)
> addia5,a5,%lo(function_to_be_called)
> prefetch.i  0(a5)
> tailfunction_to_be_called
> .size   sample, .-sample
> .ident  "GCC: (GNU) 14.0.0 20230810 (experimental)"



I hope this issue is fixed soon.

Sincerely,
Tsukasa




Tsukasa OI (1):
  RISC-V: Make "prefetch.i" built-in usable

 gcc/config/riscv/riscv-cmo.def|  4 ++--
 gcc/config/riscv/riscv.md |  5 ++---
 gcc/doc/extend.texi   |  7 +++
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c |  8 +---
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c | 10 ++
 5 files changed, 22 insertions(+), 12 deletions(-)


base-commit: 9b099a83b45b8fcdfc07d518e05d36ea741b2227
-- 
2.41.0



[Bug tree-optimization/83247] simplify (int)a_long < 0 when we know a_long fits in int

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83247

--- Comment #4 from Andrew Pinski  ---
Created attachment 55714
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55714=edit
Patch which I am testing

Re: [Patch, fortran] PR109684 - compiling failure: complaining about a final subroutine of a type being not PURE (while it is indeed PURE)

2023-08-09 Thread Jerry D via Gcc-patches

On 8/9/23 7:58 AM, Paul Richard Thomas via Fortran wrote:

I took a look at my calendar and decided to backport right away.

r13-7703-ged049e5d5f36cc0f4318cd93bb6b33ed6f6f2ba7

BTW It is a regression :-)

Paul

On Wed, 9 Aug 2023 at 12:10, Paul Richard Thomas
 wrote:


Committed to trunk as 'obvious' in
r14-3098-gb8ec3c952324f866f191883473922e250be81341

13-branch to follow in a few days.

Paul


Thanks Paul, I was about ready to do it myself.

Jerry


[RFC PATCH 0/2] RISC-V: Make __builtin_riscv_pause 'Zihintpause' only

2023-08-09 Thread Tsukasa OI via Gcc-patches
**WARNING**

Following patch sets are exclusive:

1.  [RFC PATCH v2] RISC-V: __builtin_riscv_pause for all environment
2.  [RFC PATCH] RISC-V: Make __builtin_riscv_pause 'Zihintpause' only (this)

See 
for the background of this patch set.


Changes: (all environment) v1 -> ('Zihintpause'-only) v1

*   Based on the feedback from Jeff Law, I made an alternative patch set
which makes "__builtin_riscv_pause" 'Zihintpause'-only.


Comparison: Patch sets [1] and [2] (this)

*   [1] completely preserves the compatibility with GCC 13
([2] removes __builtin_riscv_pause if the 'Zihintpause' extension is
 absent, making a minor compatibility issue)
*   Because of the nature of the instrinsic, [2] is more natural
("pause" is defined in the 'Zihintpause' extension).


Please consider those patch sets and decide which to apply.

Sincerely,
Tsukasa




Tsukasa OI (2):
  RISC-V: Make __builtin_riscv_pause 'Zihintpause' only
  RISC-V: Fix documentation of __builtin_riscv_pause

 gcc/common/config/riscv/riscv-common.cc|  2 ++
 gcc/config/riscv/riscv-builtins.cc |  4 ++--
 gcc/config/riscv/riscv-opts.h  |  2 ++
 gcc/config/riscv/riscv.md  |  2 +-
 gcc/doc/extend.texi|  6 +++---
 gcc/testsuite/gcc.target/riscv/builtin_pause.c | 10 --
 gcc/testsuite/gcc.target/riscv/zihintpause-1.c | 11 +++
 gcc/testsuite/gcc.target/riscv/zihintpause-2.c | 11 +++
 8 files changed, 32 insertions(+), 16 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/riscv/builtin_pause.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-2.c


base-commit: 9b099a83b45b8fcdfc07d518e05d36ea741b2227
-- 
2.41.0



[RFC PATCH 2/2] RISC-V: Fix documentation of __builtin_riscv_pause

2023-08-09 Thread Tsukasa OI via Gcc-patches
From: Tsukasa OI 

This built-in does not imply the 'Xgnuzihintpausestate' extension.
It does not change architectural state (because all HINTs are prohibited
from doing that).

gcc/ChangeLog:

* doc/extend.texi: Fix the description of __builtin_riscv_pause.
---
 gcc/doc/extend.texi | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 89c5b4ea2b20..7be27430666a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21570,9 +21570,9 @@ Returns the value that is currently set in the 
@samp{tp} register.
 @enddefbuiltin
 
 @defbuiltin{void __builtin_riscv_pause (void)}
-Generates the @code{pause} (hint) machine instruction.  This implies the
-Xgnuzihintpausestate extension, which redefines the @code{pause} instruction to
-change architectural state.
+Generates the @code{pause} (hint) machine instruction, indicating that the
+current hart should be temporarily paused or slowed down.
+Available if the Zihintpause extension is enabled.
 @enddefbuiltin
 
 @node RISC-V Vector Intrinsics
-- 
2.41.0



[RFC PATCH 1/2] RISC-V: Make __builtin_riscv_pause 'Zihintpause' only

2023-08-09 Thread Tsukasa OI via Gcc-patches
From: Tsukasa OI 

The "pause" RISC-V hint instruction requires the 'Zihintpause' extension
(in the assembler).  However, GCC emits "pause" unconditionally, making
an assembler error while compiling code with __builtin_riscv_pause while
the 'Zihintpause' extension disabled.

Despite that the "pause" instruction code (0x010f) is a HINT and
emitting its instruction code is safe in any environment, enabling the
built-in only for certain environment is not a good idea.

This commit implements handling for the 'Zihintpause' extension and makes
__built_riscv_pause built-in 'Zihintpause'-only.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_ext_version_table): Implement the 'Zihintpause' extension,
version 2.0.  (riscv_ext_flag_table) Add 'Zihintpause' handling.
* config/riscv/riscv-builtins.cc: Remove availability predicate
"always" and add "hint_pause", corresponding the existence of the
'Zihintpause' extension.
(riscv_builtins) Change che "riscv_pause" requirements.
* config/riscv/riscv-opts.h
(MASK_ZIHINTPAUSE, TARGET_ZIHINTPAUSE): New.
* config/riscv/riscv.md (riscv_pause): Make it only available when
the 'Zihintpause' extension is enabled.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/builtin_pause.c: Removed.
* gcc.target/riscv/zihintpause-1.c:
New test when the 'Zihintpause' extension is enabled.
* gcc.target/riscv/zihintpause-2.c: Likewise.
---
 gcc/common/config/riscv/riscv-common.cc|  2 ++
 gcc/config/riscv/riscv-builtins.cc |  4 ++--
 gcc/config/riscv/riscv-opts.h  |  2 ++
 gcc/config/riscv/riscv.md  |  2 +-
 gcc/testsuite/gcc.target/riscv/builtin_pause.c | 10 --
 gcc/testsuite/gcc.target/riscv/zihintpause-1.c | 11 +++
 gcc/testsuite/gcc.target/riscv/zihintpause-2.c | 11 +++
 7 files changed, 29 insertions(+), 13 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/riscv/builtin_pause.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-2.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index f25131eab28b..b77fdb909567 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -209,6 +209,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zkt",   ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"zihintntl", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zihintpause", ISA_SPEC_CLASS_NONE, 2, 0},
 
   {"zicboz",ISA_SPEC_CLASS_NONE, 1, 0},
   {"zicbom",ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1343,6 +1344,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"zkt",_options::x_riscv_zk_subext, MASK_ZKT},
 
   {"zihintntl", _options::x_riscv_zi_subext, MASK_ZIHINTNTL},
+  {"zihintpause", _options::x_riscv_zi_subext, MASK_ZIHINTPAUSE},
 
   {"zicboz", _options::x_riscv_zicmo_subext, MASK_ZICBOZ},
   {"zicbom", _options::x_riscv_zicmo_subext, MASK_ZICBOM},
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 79681d759628..73e1512a5992 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -122,7 +122,7 @@ AVAIL (clmul_zbkc32_or_zbc32, (TARGET_ZBKC || TARGET_ZBC) 
&& !TARGET_64BIT)
 AVAIL (clmul_zbkc64_or_zbc64, (TARGET_ZBKC || TARGET_ZBC) && TARGET_64BIT)
 AVAIL (clmulr_zbc32, TARGET_ZBC && !TARGET_64BIT)
 AVAIL (clmulr_zbc64, TARGET_ZBC && TARGET_64BIT)
-AVAIL (always, (!0))
+AVAIL (hint_pause, TARGET_ZIHINTPAUSE)
 
 /* Construct a riscv_builtin_description from the given arguments.
 
@@ -179,7 +179,7 @@ static const struct riscv_builtin_description 
riscv_builtins[] = {
 
   DIRECT_BUILTIN (frflags, RISCV_USI_FTYPE, hard_float),
   DIRECT_NO_TARGET_BUILTIN (fsflags, RISCV_VOID_FTYPE_USI, hard_float),
-  DIRECT_NO_TARGET_BUILTIN (pause, RISCV_VOID_FTYPE, always),
+  DIRECT_NO_TARGET_BUILTIN (pause, RISCV_VOID_FTYPE, hint_pause),
 };
 
 /* Index I is the function declaration for riscv_builtins[I], or null if the
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 28d9b81bd800..a6c3e0c9098f 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -102,10 +102,12 @@ enum riscv_entity
 #define MASK_ZICSR(1 << 0)
 #define MASK_ZIFENCEI (1 << 1)
 #define MASK_ZIHINTNTL (1 << 2)
+#define MASK_ZIHINTPAUSE (1 << 3)
 
 #define TARGET_ZICSR((riscv_zi_subext & MASK_ZICSR) != 0)
 #define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
 #define TARGET_ZIHINTNTL ((riscv_zi_subext & MASK_ZIHINTNTL) != 0)
+#define TARGET_ZIHINTPAUSE ((riscv_zi_subext & MASK_ZIHINTPAUSE) != 0)
 
 #define MASK_ZAWRS   (1 << 0)
 #define TARGET_ZAWRS ((riscv_za_subext & MASK_ZAWRS) != 0)
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 688fd697255b..44ea1c54b96b 100644
--- 

[RFC PATCH v2 1/2] RISC-V: __builtin_riscv_pause for all environment

2023-08-09 Thread Tsukasa OI via Gcc-patches
From: Tsukasa OI 

The "pause" RISC-V hint instruction requires the 'Zihintpause' extension
(in the assembler).  However, GCC emits "pause" unconditionally, making
an assembler error while compiling code with __builtin_riscv_pause while
the 'Zihintpause' extension disabled.

However, the "pause" instruction code (0x010f) is a HINT and emitting
its instruction code is safe in any environment.

This commit implements handling for the 'Zihintpause' extension and emits
".insn 0x010f" instead of "pause" only if the extension is disabled
(making the diagnostics better).

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_ext_version_table): Implement the 'Zihintpause' extension,
version 2.0.  (riscv_ext_flag_table) Add 'Zihintpause' handling.
* config/riscv/riscv-builtins.cc: Remove availability predicate
"always" and add "hint_pause" and "hint_pause_pseudo", corresponding
the existence of the 'Zihintpause' extension.
(riscv_builtins) Split builtin implementation depending on the
existence of the 'Zihintpause' extension.
* config/riscv/riscv-opts.h
(MASK_ZIHINTPAUSE, TARGET_ZIHINTPAUSE): New.
* config/riscv/riscv.md (riscv_pause): Make it only available when
the 'Zihintpause' extension is enabled.  (riscv_pause_insn) New
"pause" implementation when the 'Zihintpause' extension is disabled.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/builtin_pause.c: Removed.
* gcc.target/riscv/zihintpause-1.c:
New test when the 'Zihintpause' extension is enabled.
* gcc.target/riscv/zihintpause-2.c: Likewise.
* gcc.target/riscv/zihintpause-noarch.c:
New test when the 'Zihintpause' extension is disabled.
---
 gcc/common/config/riscv/riscv-common.cc |  2 ++
 gcc/config/riscv/riscv-builtins.cc  |  6 --
 gcc/config/riscv/riscv-opts.h   |  2 ++
 gcc/config/riscv/riscv.md   |  7 ++-
 gcc/testsuite/gcc.target/riscv/builtin_pause.c  | 10 --
 gcc/testsuite/gcc.target/riscv/zihintpause-1.c  | 11 +++
 gcc/testsuite/gcc.target/riscv/zihintpause-2.c  | 11 +++
 gcc/testsuite/gcc.target/riscv/zihintpause-noarch.c | 12 
 8 files changed, 48 insertions(+), 13 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/riscv/builtin_pause.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-noarch.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index f25131eab28b..b77fdb909567 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -209,6 +209,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zkt",   ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"zihintntl", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zihintpause", ISA_SPEC_CLASS_NONE, 2, 0},
 
   {"zicboz",ISA_SPEC_CLASS_NONE, 1, 0},
   {"zicbom",ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1343,6 +1344,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"zkt",_options::x_riscv_zk_subext, MASK_ZKT},
 
   {"zihintntl", _options::x_riscv_zi_subext, MASK_ZIHINTNTL},
+  {"zihintpause", _options::x_riscv_zi_subext, MASK_ZIHINTPAUSE},
 
   {"zicboz", _options::x_riscv_zicmo_subext, MASK_ZICBOZ},
   {"zicbom", _options::x_riscv_zicmo_subext, MASK_ZICBOM},
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 79681d759628..554fb7f69bb0 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -122,7 +122,8 @@ AVAIL (clmul_zbkc32_or_zbc32, (TARGET_ZBKC || TARGET_ZBC) 
&& !TARGET_64BIT)
 AVAIL (clmul_zbkc64_or_zbc64, (TARGET_ZBKC || TARGET_ZBC) && TARGET_64BIT)
 AVAIL (clmulr_zbc32, TARGET_ZBC && !TARGET_64BIT)
 AVAIL (clmulr_zbc64, TARGET_ZBC && TARGET_64BIT)
-AVAIL (always, (!0))
+AVAIL (hint_pause, TARGET_ZIHINTPAUSE)
+AVAIL (hint_pause_pseudo, !TARGET_ZIHINTPAUSE)
 
 /* Construct a riscv_builtin_description from the given arguments.
 
@@ -179,7 +180,8 @@ static const struct riscv_builtin_description 
riscv_builtins[] = {
 
   DIRECT_BUILTIN (frflags, RISCV_USI_FTYPE, hard_float),
   DIRECT_NO_TARGET_BUILTIN (fsflags, RISCV_VOID_FTYPE_USI, hard_float),
-  DIRECT_NO_TARGET_BUILTIN (pause, RISCV_VOID_FTYPE, always),
+  RISCV_BUILTIN (pause, "pause", RISCV_BUILTIN_DIRECT_NO_TARGET, 
RISCV_VOID_FTYPE, hint_pause),
+  RISCV_BUILTIN (pause_insn, "pause", RISCV_BUILTIN_DIRECT_NO_TARGET, 
RISCV_VOID_FTYPE, hint_pause_pseudo),
 };
 
 /* Index I is the function declaration for riscv_builtins[I], or null if the
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 28d9b81bd800..a6c3e0c9098f 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h

[RFC PATCH v2 2/2] RISC-V: Fix documentation of __builtin_riscv_pause

2023-08-09 Thread Tsukasa OI via Gcc-patches
From: Tsukasa OI 

This built-in does not imply the 'Xgnuzihintpausestate' extension.
It does not change architectural state (because all HINTs are prohibited
from doing that).

gcc/ChangeLog:

* doc/extend.texi: Fix the description of __builtin_riscv_pause.
---
 gcc/doc/extend.texi | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 89c5b4ea2b20..7ebbe70c78d6 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21570,9 +21570,9 @@ Returns the value that is currently set in the 
@samp{tp} register.
 @enddefbuiltin
 
 @defbuiltin{void __builtin_riscv_pause (void)}
-Generates the @code{pause} (hint) machine instruction.  This implies the
-Xgnuzihintpausestate extension, which redefines the @code{pause} instruction to
-change architectural state.
+Generates the @code{pause} (hint) machine instruction.  If the target 
implements
+the Zihintpause extension, it indicates that the current hart should be
+temporarily paused or slowed down.
 @enddefbuiltin
 
 @node RISC-V Vector Intrinsics
-- 
2.41.0



[RFC PATCH v2 0/2] RISC-V: __builtin_riscv_pause for all environment

2023-08-09 Thread Tsukasa OI via Gcc-patches
**WARNING**

Following patch sets are exclusive:

1.  [RFC PATCH v2] RISC-V: __builtin_riscv_pause for all environment (this)
2.  [RFC PATCH] RISC-V: Make __builtin_riscv_pause 'Zihintpause' only

See 
for the background of this patch set.


Changes: v1 -> v2

*   Improve test case to test both RV32 and RV64 (+'Zihintpause').


Comparison: Patch sets [1] (this) and [2]

*   [1] completely preserves the compatibility with GCC 13
([2] removes __builtin_riscv_pause if the 'Zihintpause' extension is
 absent, making a minor compatibility issue)
*   Because of the nature of the instrinsic, [2] is more natural
("pause" is defined in the 'Zihintpause' extension).


Please consider those patch sets and decide which to apply.

Sincerely,
Tsukasa




Tsukasa OI (2):
  RISC-V: __builtin_riscv_pause for all environment
  RISC-V: Fix documentation of __builtin_riscv_pause

 gcc/common/config/riscv/riscv-common.cc |  2 ++
 gcc/config/riscv/riscv-builtins.cc  |  6 --
 gcc/config/riscv/riscv-opts.h   |  2 ++
 gcc/config/riscv/riscv.md   |  7 ++-
 gcc/doc/extend.texi |  6 +++---
 gcc/testsuite/gcc.target/riscv/builtin_pause.c  | 10 --
 gcc/testsuite/gcc.target/riscv/zihintpause-1.c  | 11 +++
 gcc/testsuite/gcc.target/riscv/zihintpause-2.c  | 11 +++
 gcc/testsuite/gcc.target/riscv/zihintpause-noarch.c | 12 
 9 files changed, 51 insertions(+), 16 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/riscv/builtin_pause.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zihintpause-noarch.c


base-commit: 9b099a83b45b8fcdfc07d518e05d36ea741b2227
-- 
2.41.0



Re: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Kito Cheng via Gcc-patches
Yeah, no further comment from me :)

On Thu, Aug 10, 2023 at 10:16 AM Li, Pan2  wrote:
>
> Thanks kito. It makes sense, should not reach default, may I prepare v3(add 
> gcc_unreachable to default) if no more comments?
>
> Pan
>
> -Original Message-
> From: Kito Cheng 
> Sent: Thursday, August 10, 2023 10:12 AM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; 
> Wang, Yanzhang ; Kito Cheng 
> Subject: Re: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode 
> intrinsic
>
> > +/* Get the frm mode with given CONST_INT rtx, the default mode is
> > +   FRM_DYN.  */
> > +enum floating_point_rounding_mode
> > +get_frm_mode (rtx operand)
> > +{
> > +  gcc_assert (CONST_INT_P (operand));
> > +
> > +  switch (INTVAL (operand))
> > +{
> > +case FRM_RNE:
> > +  return FRM_RNE;
> > +case FRM_RTZ:
> > +  return FRM_RTZ;
> > +case FRM_RDN:
> > +  return FRM_RDN;
> > +case FRM_RUP:
> > +  return FRM_RUP;
> > +case FRM_RMM:
> > +  return FRM_RMM;
> > +case FRM_DYN:
> > +  return FRM_DYN;
> > +default:
> > +  return FRM_DYN;
>
> Should we put a gcc_unreachable or gcc_assert here? I am not sure if
> another value is valid when it appeared for this operand?
>
> > +}
> > +
> > +  gcc_unreachable ();
> > +}


RE: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Li, Pan2 via Gcc-patches
Thanks kito. It makes sense, should not reach default, may I prepare v3(add 
gcc_unreachable to default) if no more comments?

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, August 10, 2023 10:12 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; Wang, 
Yanzhang ; Kito Cheng 
Subject: Re: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode 
intrinsic

> +/* Get the frm mode with given CONST_INT rtx, the default mode is
> +   FRM_DYN.  */
> +enum floating_point_rounding_mode
> +get_frm_mode (rtx operand)
> +{
> +  gcc_assert (CONST_INT_P (operand));
> +
> +  switch (INTVAL (operand))
> +{
> +case FRM_RNE:
> +  return FRM_RNE;
> +case FRM_RTZ:
> +  return FRM_RTZ;
> +case FRM_RDN:
> +  return FRM_RDN;
> +case FRM_RUP:
> +  return FRM_RUP;
> +case FRM_RMM:
> +  return FRM_RMM;
> +case FRM_DYN:
> +  return FRM_DYN;
> +default:
> +  return FRM_DYN;

Should we put a gcc_unreachable or gcc_assert here? I am not sure if
another value is valid when it appeared for this operand?

> +}
> +
> +  gcc_unreachable ();
> +}


Re: [PATCH] libatomic: Improve ifunc selection on AArch64

2023-08-09 Thread Richard Henderson via Gcc-patches

On 8/9/23 19:11, Richard Henderson wrote:

On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:

+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr)    (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr)    (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store.  */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif


Why would HWCAP_USCAT not be set by the kernel?

Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.


Answering my own question, N1 does not officially have FEAT_LSE2.


r~



Re: [PATCH] libatomic: Improve ifunc selection on AArch64

2023-08-09 Thread Richard Henderson via Gcc-patches

On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:

+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+return true;
+  if (!(hwcap & HWCAP_CPUID))
+return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store.  */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+return true;
+
+  return false;
+}
+#endif


Why would HWCAP_USCAT not be set by the kernel?

Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.


r~


Re: [PATCH v2] RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

2023-08-09 Thread Kito Cheng via Gcc-patches
> +/* Get the frm mode with given CONST_INT rtx, the default mode is
> +   FRM_DYN.  */
> +enum floating_point_rounding_mode
> +get_frm_mode (rtx operand)
> +{
> +  gcc_assert (CONST_INT_P (operand));
> +
> +  switch (INTVAL (operand))
> +{
> +case FRM_RNE:
> +  return FRM_RNE;
> +case FRM_RTZ:
> +  return FRM_RTZ;
> +case FRM_RDN:
> +  return FRM_RDN;
> +case FRM_RUP:
> +  return FRM_RUP;
> +case FRM_RMM:
> +  return FRM_RMM;
> +case FRM_DYN:
> +  return FRM_DYN;
> +default:
> +  return FRM_DYN;

Should we put a gcc_unreachable or gcc_assert here? I am not sure if
another value is valid when it appeared for this operand?

> +}
> +
> +  gcc_unreachable ();
> +}


[Bug tree-optimization/110248] ivopts could under-cost for some addressing modes on len_{load,store}

2023-08-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110248

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #11 from Kewen Lin  ---
Should be fixed on trunk, thanks all!

[Bug tree-optimization/83247] simplify (int)a_long < 0 when we know a_long fits in int

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83247

--- Comment #3 from Andrew Pinski  ---
This was literally 4 lines which needed to change now.
The use of range_of_expr inside simplify_casted_compare just needed to be
passed the statement and that fixed it 

RE: [PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions.

2023-08-09 Thread Liu, Hongtao via Gcc-patches


> -Original Message-
> From: Xi Ruoyao 
> Sent: Thursday, August 10, 2023 9:48 AM
> To: Liu, Hongtao ; gcc-patches@gcc.gnu.org
> Cc: richard.guent...@gmail.com; ubiz...@gmail.com; hubi...@ucw.cz
> Subject: Re: [PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable
> vectorization for all gather/scatter instructions.
> 
> On Thu, 2023-08-10 at 09:11 +0800, liuhongt via Gcc-patches wrote:
> > Currently we have 3 different independent tunes for gather
> > "use_gather,use_gather_2parts,use_gather_4parts",
> > similar for scatter, there're
> > "use_scatter,use_scatter_2parts,use_scatter_4parts"
> >
> > The patch support 2 standardizing options to enable/disable
> > vectorization for all gather/scatter instructions. The options is
> > interpreted by driver to 3 tunes.
> >
> > bootstrapped and regtested on x86_64-pc-linux-gnu.
> > Ok for trunk?
> 
> And should we set -mno-gather as the default for GDS affected processors?
> We'll likely apply the ucode update for them, and then the gathering
> instructions will be much slower.
Assume you're talking about 
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/gather-data-sampling.html
Yes, there will be an separate patch for microarchitecture tuning.
> 
> > gcc/ChangeLog:
> >
> > * config/i386/i386.h (DRIVER_SELF_SPECS): Add
> > GATHER_SCATTER_DRIVER_SELF_SPECS.
> > (GATHER_SCATTER_DRIVER_SELF_SPECS): New macro.
> > * config/i386/i386.opt (mgather): New option.
> > (mscatter): Ditto.
> > ---
> >  gcc/config/i386/i386.h   | 12 +++-
> >  gcc/config/i386/i386.opt |  8 
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index
> > ef342fcee9b..d9ac2c29bde 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -565,7 +565,17 @@ extern GTY(()) tree x86_mfence;
> >  # define SUBTARGET_DRIVER_SELF_SPECS ""
> >  #endif
> >
> > -#define DRIVER_SELF_SPECS SUBTARGET_DRIVER_SELF_SPECS
> > +#ifndef GATHER_SCATTER_DRIVER_SELF_SPECS # define
> > +GATHER_SCATTER_DRIVER_SELF_SPECS \
> > +  "%{mno-gather:-mtune-
> > ctrl=^use_gather_2parts,^use_gather_4parts,^use_gather} \
> > +   %{mgather:-mtune-
> > ctrl=use_gather_2parts,use_gather_4parts,use_gather} \
> > +   %{mno-scatter:-mtune-
> > ctrl=^use_scatter_2parts,^use_scatter_4parts,^use_scatter} \
> > +   %{mscatter:-mtune-
> > ctrl=use_scatter_2parts,use_scatter_4parts,use_scatter}"
> > +#endif
> > +
> > +#define DRIVER_SELF_SPECS \
> > +  SUBTARGET_DRIVER_SELF_SPECS " " \
> > +  GATHER_SCATTER_DRIVER_SELF_SPECS
> >
> >  /* -march=native handling only makes sense with compiler running on
> >     an x86 or x86_64 chip.  If changing this condition, also change
> > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index
> > ddb7f110aa2..99948644a8d 100644
> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -424,6 +424,14 @@ mdaz-ftz
> >  Target
> >  Set the FTZ and DAZ Flags.
> >
> > +mgather
> > +Target
> > +Enable vectorization for gather instruction.
> > +
> > +mscatter
> > +Target
> > +Enable vectorization for scatter instruction.
> > +
> >  mpreferred-stack-boundary=
> >  Target RejectNegative Joined UInteger
> > Var(ix86_preferred_stack_boundary_arg)
> >  Attempt to keep stack aligned to this power of 2.
> 
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions.

2023-08-09 Thread Xi Ruoyao via Gcc-patches
On Thu, 2023-08-10 at 09:11 +0800, liuhongt via Gcc-patches wrote:
> Currently we have 3 different independent tunes for gather
> "use_gather,use_gather_2parts,use_gather_4parts",
> similar for scatter, there're
> "use_scatter,use_scatter_2parts,use_scatter_4parts"
> 
> The patch support 2 standardizing options to enable/disable
> vectorization for all gather/scatter instructions. The options is
> interpreted by driver to 3 tunes.
> 
> bootstrapped and regtested on x86_64-pc-linux-gnu.
> Ok for trunk?

And should we set -mno-gather as the default for GDS affected
processors?  We'll likely apply the ucode update for them, and then the
gathering instructions will be much slower.

> gcc/ChangeLog:
> 
> * config/i386/i386.h (DRIVER_SELF_SPECS): Add
> GATHER_SCATTER_DRIVER_SELF_SPECS.
> (GATHER_SCATTER_DRIVER_SELF_SPECS): New macro.
> * config/i386/i386.opt (mgather): New option.
> (mscatter): Ditto.
> ---
>  gcc/config/i386/i386.h   | 12 +++-
>  gcc/config/i386/i386.opt |  8 
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index ef342fcee9b..d9ac2c29bde 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -565,7 +565,17 @@ extern GTY(()) tree x86_mfence;
>  # define SUBTARGET_DRIVER_SELF_SPECS ""
>  #endif
>  
> -#define DRIVER_SELF_SPECS SUBTARGET_DRIVER_SELF_SPECS
> +#ifndef GATHER_SCATTER_DRIVER_SELF_SPECS
> +# define GATHER_SCATTER_DRIVER_SELF_SPECS \
> +  "%{mno-gather:-mtune-
> ctrl=^use_gather_2parts,^use_gather_4parts,^use_gather} \
> +   %{mgather:-mtune-
> ctrl=use_gather_2parts,use_gather_4parts,use_gather} \
> +   %{mno-scatter:-mtune-
> ctrl=^use_scatter_2parts,^use_scatter_4parts,^use_scatter} \
> +   %{mscatter:-mtune-
> ctrl=use_scatter_2parts,use_scatter_4parts,use_scatter}"
> +#endif
> +
> +#define DRIVER_SELF_SPECS \
> +  SUBTARGET_DRIVER_SELF_SPECS " " \
> +  GATHER_SCATTER_DRIVER_SELF_SPECS
>  
>  /* -march=native handling only makes sense with compiler running on
>     an x86 or x86_64 chip.  If changing this condition, also change
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index ddb7f110aa2..99948644a8d 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -424,6 +424,14 @@ mdaz-ftz
>  Target
>  Set the FTZ and DAZ Flags.
>  
> +mgather
> +Target
> +Enable vectorization for gather instruction.
> +
> +mscatter
> +Target
> +Enable vectorization for scatter instruction.
> +
>  mpreferred-stack-boundary=
>  Target RejectNegative Joined UInteger
> Var(ix86_preferred_stack_boundary_arg)
>  Attempt to keep stack aligned to this power of 2.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions.

2023-08-09 Thread liuhongt via Gcc-patches
Currently we have 3 different independent tunes for gather
"use_gather,use_gather_2parts,use_gather_4parts",
similar for scatter, there're
"use_scatter,use_scatter_2parts,use_scatter_4parts"

The patch support 2 standardizing options to enable/disable
vectorization for all gather/scatter instructions. The options is
interpreted by driver to 3 tunes.

bootstrapped and regtested on x86_64-pc-linux-gnu.
Ok for trunk?

gcc/ChangeLog:

* config/i386/i386.h (DRIVER_SELF_SPECS): Add
GATHER_SCATTER_DRIVER_SELF_SPECS.
(GATHER_SCATTER_DRIVER_SELF_SPECS): New macro.
* config/i386/i386.opt (mgather): New option.
(mscatter): Ditto.
---
 gcc/config/i386/i386.h   | 12 +++-
 gcc/config/i386/i386.opt |  8 
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index ef342fcee9b..d9ac2c29bde 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -565,7 +565,17 @@ extern GTY(()) tree x86_mfence;
 # define SUBTARGET_DRIVER_SELF_SPECS ""
 #endif
 
-#define DRIVER_SELF_SPECS SUBTARGET_DRIVER_SELF_SPECS
+#ifndef GATHER_SCATTER_DRIVER_SELF_SPECS
+# define GATHER_SCATTER_DRIVER_SELF_SPECS \
+  "%{mno-gather:-mtune-ctrl=^use_gather_2parts,^use_gather_4parts,^use_gather} 
\
+   %{mgather:-mtune-ctrl=use_gather_2parts,use_gather_4parts,use_gather} \
+   
%{mno-scatter:-mtune-ctrl=^use_scatter_2parts,^use_scatter_4parts,^use_scatter} 
\
+   %{mscatter:-mtune-ctrl=use_scatter_2parts,use_scatter_4parts,use_scatter}"
+#endif
+
+#define DRIVER_SELF_SPECS \
+  SUBTARGET_DRIVER_SELF_SPECS " " \
+  GATHER_SCATTER_DRIVER_SELF_SPECS
 
 /* -march=native handling only makes sense with compiler running on
an x86 or x86_64 chip.  If changing this condition, also change
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index ddb7f110aa2..99948644a8d 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -424,6 +424,14 @@ mdaz-ftz
 Target
 Set the FTZ and DAZ Flags.
 
+mgather
+Target
+Enable vectorization for gather instruction.
+
+mscatter
+Target
+Enable vectorization for scatter instruction.
+
 mpreferred-stack-boundary=
 Target RejectNegative Joined UInteger Var(ix86_preferred_stack_boundary_arg)
 Attempt to keep stack aligned to this power of 2.
-- 
2.31.1



RISC-V: Enable Hoist to GCSE simple constants

2023-08-09 Thread juzhe.zh...@rivai.ai

-/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts 
"-funroll-loops" no-opts "-g" } } } } */
+/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1" no-opts "-O2" no-opts "-Os" no-opts "-Oz" 
no-opts "-funroll-loops" no-opts "-g" } } } } */
-/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts 
"-funroll-loops" no-opts "-g" } } } } */
+/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1" no-opts "-O2"  no-opts "-Os" no-opts "-Oz" 
no-opts "-funroll-loops" no-opts "-g" } } } } */

Instead of disabling O2, you should remove these 2 assembly check since you are 
disabling all option now. 
I have checked this patch, this patch didn't make VSETVL  codegen worse.
Ok from myside.


juzhe.zh...@rivai.ai


Re: Re: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL PASS]

2023-08-09 Thread juzhe.zh...@rivai.ai
Yes. I think so. Will backport GCC 13 soon.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-08-10 01:01
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL 
PASS]
 
 
On 8/9/23 04:51, Juzhe-Zhong wrote:
> Realize we have a bug in VSETVL PASS which is triggered by 
> strided_load_run-1.c in RV32 system.
> 
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> 
> This is because VSETVL PASS incorrect hoist vsetvl instruction:
> 
> ...
> 10156: 0d9075d7  vsetvli a1,zero,e64,m2,ta,ma ---> pollute 'a1' 
> register which will be used by following insns.
> 1015a: 01d586b3  add a3,a1,t4  > use 'a1'
> 1015e: 5e070257  vmv.v.v v4,v14
> 10162: b7032257  vmacc.vv v4,v6,v16
> 10166: 26440257  vand.vv v4,v4,v8
> 1016a: 22880227  vs2r.v v4,(a6)
> 1016e: 00b6b7b3  sltu a5,a3,a1
> 10172: 22888227  vs2r.v v4,(a7)
> 10176: 9e60b157  vmv2r.v v2,v6
> 1017a: 97baadd a5,a5,a4
> 1017c: a6a62157  vmadd.vv v2,v12,v10
> 10180: 26240157  vand.vv v2,v2,v8
> 10184: 22830127  vs2r.v v2,(t1)
> 10188: 873emv a4,a5
> 1018a: 982aadd a6,a6,a0
> 1018c: 98aaadd a7,a7,a0
> 1018e: 932aadd t1,t1,a0
> 10190: 85b6mv a1,a3   -> set 'a1'
> ...
> 
> gcc/ChangeLog:
> 
>  * config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix 
> incorrect anticipate info.
> 
> gcc/testsuite/ChangeLog:
> 
>  * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: 
> Adapt test.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
OK.
 
Do we need to backport this to gcc-13?
 
jeff
 


[PATCH] i386: Do not sanitize upper part of V2HFmode and V4HFmode reg with -fno-trapping-math [PR110832]

2023-08-09 Thread liuhongt via Gcc-patches
Also add ix86_partial_vec_fp_math to to condition of V2HF/V4HF named
patterns in order to avoid generation of partial vector V8HFmode
trapping instructions.

Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}
Ok for trunk?

gcc/ChangeLog:

PR target/110832
* config/i386/mmx.md: (movq__to_sse): Also do not
sanitize upper part of V4HFmode register with
-fno-trapping-math.
(v4hf3): Enable for ix86_partial_vec_fp_math.
(v2hf3): Ditto.
(divv2hf3): Ditto.
(movd_v2hf_to_sse): Do not sanitize upper part of V2HFmode
register with -fno-trapping-math.
---
 gcc/config/i386/mmx.md | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index d51b3b9dc71..170432a7128 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -596,7 +596,7 @@ (define_expand "movq__to_sse"
  (match_dup 2)))]
   "TARGET_SSE2"
 {
-  if (mode == V2SFmode
+  if (mode != V2SImode
   && !flag_trapping_math)
 {
   rtx op1 = force_reg (mode, operands[1]);
@@ -1941,7 +1941,7 @@ (define_expand "v4hf3"
(plusminusmult:V4HF
  (match_operand:V4HF 1 "nonimmediate_operand")
  (match_operand:V4HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
@@ -1961,7 +1961,7 @@ (define_expand "divv4hf3"
(div:V4HF
  (match_operand:V4HF 1 "nonimmediate_operand")
  (match_operand:V4HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
@@ -1983,14 +1983,22 @@ (define_expand "movd_v2hf_to_sse"
(match_operand:V2HF 1 "nonimmediate_operand"))
  (match_operand:V8HF 2 "reg_or_0_operand")
  (const_int 3)))]
-  "TARGET_SSE")
+  "TARGET_SSE"
+{
+  if (!flag_trapping_math && operands[2] == CONST0_RTX (V8HFmode))
+  {
+rtx op1 = force_reg (V2HFmode, operands[1]);
+emit_move_insn (operands[0], lowpart_subreg (V8HFmode, op1, V2HFmode));
+DONE;
+  }
+})
 
 (define_expand "v2hf3"
   [(set (match_operand:V2HF 0 "register_operand")
(plusminusmult:V2HF
  (match_operand:V2HF 1 "nonimmediate_operand")
  (match_operand:V2HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
@@ -2009,7 +2017,7 @@ (define_expand "divv2hf3"
(div:V2HF
  (match_operand:V2HF 1 "nonimmediate_operand")
  (match_operand:V2HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
-- 
2.31.1



[Bug modula2/108119] m2rte plugin should be disabled by default

2023-08-09 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108119

Eric Gallager  changed:

   What|Removed |Added

 Ever confirmed|0   |1
Summary|m2rte Seems like it should  |m2rte plugin should be
   |not be there at all |disabled by default
   Last reconfirmed||2023-08-10
 CC||egallager at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #4 from Eric Gallager  ---
m2rte just broke bootstrap for me, so I'm going to confirm this with a slightly
different title, and mark as ASSIGNED since there's an assignee

[PATCH] RISC-V: Enable Hoist to GCSE simple constants

2023-08-09 Thread Vineet Gupta
Hoist want_to_gcse_p () calls rtx_cost () to compute max distance for
hoist candidates . For a const with cost 1 backend currently returns 0,
causing Hoist to bail and elide GCSE.

Note that constants requiring more than 1 insns to setup were working
already since backend is returning 1 as well. Arguably that needs updating
as well to reflect cost better, but that should be a different change
anyways.

To keep testsuite parity, some V tests need updatinge which started failing
in the new costing regime.

gcc/ChangeLog:
* gcc/config/riscv.cc (riscv_rtx_costs): Adjust const_int cost.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/gcse-const.c: New Test
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c: Disable for
  -O2.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c: Ditto.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv.cc   |  9 ++---
 gcc/testsuite/gcc.target/riscv/gcse-const.c | 13 +
 .../gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c  |  2 +-
 4 files changed, 17 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/gcse-const.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8b7256108157..1802eef908fc 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2464,14 +2464,9 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
 case CONST:
   if ((cost = riscv_const_insns (x)) > 0)
{
- /* If the constant is likely to be stored in a GPR, SETs of
-single-insn constants are as cheap as register sets; we
-never want to CSE them.  */
+ /* Hoist will GCSE constants only if cost is non-zero.  */
  if (cost == 1 && outer_code == SET)
-   *total = 0;
- /* When we load a constant more than once, it usually is better
-to duplicate the last operation in the sequence than to CSE
-the constant itself.  */
+   *total = COSTS_N_INSNS (1);
  else if (outer_code == SET || GET_MODE (x) == VOIDmode)
*total = COSTS_N_INSNS (1);
}
diff --git a/gcc/testsuite/gcc.target/riscv/gcse-const.c 
b/gcc/testsuite/gcc.target/riscv/gcse-const.c
new file mode 100644
index ..b04707ce9745
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/gcse-const.c
@@ -0,0 +1,13 @@
+/* Slightly modified copy of gcc.target/arm/pr40956.c.  */
+/* { dg-options "-Os" }  */
+/* Make sure the constant "6" is loaded into register only once.  */
+/* { dg-final { scan-assembler-times "\tli.*6" 1 } } */
+
+int foo(int p, int* q)
+{
+  if (p!=9)
+*q = 6;
+  else
+*(q+1) = 6;
+  return 3;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c
index 60ad108666f8..83618880c8ea 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c
@@ -21,5 +21,5 @@ void f (int32_t * restrict in, int32_t * restrict out, size_t 
n, size_t cond, si
 }
 
 /* { dg-final { scan-assembler-times {vsetvli} 4 { target { no-opts "-O0"  
no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts "-funroll-loops" no-opts 
"-g" } } } } */
-/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts 
"-funroll-loops" no-opts "-g" } } } } */
+/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1" no-opts "-O2" no-opts "-Os" no-opts "-Oz" 
no-opts "-funroll-loops" no-opts "-g" } } } } */
 /* { dg-final { scan-assembler-times 
{vsetvli\s+[a-x0-9]+,\s*zero,\s*e8,\s*m8,\s*t[au],\s*m[au]} 3 { target { 
no-opts "-O0" no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts 
"-funroll-loops" no-opts "-g" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c
index 7b9574cc332d..8aca839c49aa 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c
@@ -21,7 +21,7 @@ void f (int32_t * restrict in, int32_t * restrict out, size_t 
n, size_t cond, si
 }
 
 /* { dg-final { scan-assembler-times {vsetvli} 5 { target { no-opts "-O0"  
no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts "-funroll-loops" no-opts 
"-g" } } } } */
-/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1"  no-opts "-Os" no-opts "-Oz" no-opts 
"-funroll-loops" no-opts "-g" } } } } */
+/* { dg-final { scan-assembler-times {j\s+\.L[0-9]+\s+\.L[0-9]+:\s+vlm\.v} 1 { 
target { no-opts "-O0" no-opts "-O1" no-opts "-O2"  no-opts "-Os" no-opts "-Oz" 
no-opts "-funroll-loops" no-opts "-g" } } 

[Bug middle-end/110954] [14 Regression] Wrong code with -O0

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110954

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2023-August/
   ||626896.html

--- Comment #4 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626896.html

[PATCH] Fix PR 110954: wrong code with cmp | !cmp

2023-08-09 Thread Andrew Pinski via Gcc-patches
This was an oversight on my part not realizing that
comparisons in generic can have a non-boolean type.
This means if we have `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.
This patch just adds the check for the type of the comparisons
to be boolean type to keep the optimization in that case.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR 110954

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): Check
the type of the comparison to be boolean too.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110954-1.c: New test.
---
 gcc/generic-match-head.cc|  3 ++-
 gcc/testsuite/gcc.c-torture/execute/pr110954-1.c | 10 ++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110954-1.c

diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
index ddaf22f2179..ac2119bfdd0 100644
--- a/gcc/generic-match-head.cc
+++ b/gcc/generic-match-head.cc
@@ -146,7 +146,8 @@ bitwise_inverted_equal_p (tree expr1, tree expr2)
   && bitwise_equal_p (expr1, TREE_OPERAND (expr2, 0)))
 return true;
   if (COMPARISON_CLASS_P (expr1)
-  && COMPARISON_CLASS_P (expr2))
+  && COMPARISON_CLASS_P (expr2)
+  && TREE_CODE (TREE_TYPE (expr1)) == BOOLEAN_TYPE)
 {
   tree op10 = TREE_OPERAND (expr1, 0);
   tree op20 = TREE_OPERAND (expr2, 0);
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
new file mode 100644
index 000..8aad758e10f
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
@@ -0,0 +1,10 @@
+
+#define comparison (f < 0)
+int main() {
+  int f = 0;
+  int d = comparison | !comparison;
+  if (d != 1)
+__builtin_abort();
+  return 0;
+}
+
-- 
2.31.1



RE: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL PASS]

2023-08-09 Thread Li, Pan2 via Gcc-patches
Committed to trunk, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Thursday, August 10, 2023 1:01 AM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL 
PASS]



On 8/9/23 04:51, Juzhe-Zhong wrote:
> Realize we have a bug in VSETVL PASS which is triggered by 
> strided_load_run-1.c in RV32 system.
> 
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c 
> execution test
> 
> This is because VSETVL PASS incorrect hoist vsetvl instruction:
> 
> ...
> 10156:0d9075d7vsetvli a1,zero,e64,m2,ta,ma ---> 
> pollute 'a1' register which will be used by following insns.
> 1015a:01d586b3add a3,a1,t4  > use 'a1'
> 1015e:5e070257vmv.v.v v4,v14
> 10162:b7032257vmacc.vvv4,v6,v16
> 10166:26440257vand.vv v4,v4,v8
> 1016a:22880227vs2r.v  v4,(a6)
> 1016e:00b6b7b3sltua5,a3,a1
> 10172:22888227vs2r.v  v4,(a7)
> 10176:9e60b157vmv2r.v v2,v6
> 1017a:97baadd a5,a5,a4
> 1017c:a6a62157vmadd.vvv2,v12,v10
> 10180:26240157vand.vv v2,v2,v8
> 10184:22830127vs2r.v  v2,(t1)
> 10188:873emv  a4,a5
> 1018a:982aadd a6,a6,a0
> 1018c:98aaadd a7,a7,a0
> 1018e:932aadd t1,t1,a0
> 10190:85b6mv  a1,a3   -> set 'a1'
> ...
> 
> gcc/ChangeLog:
> 
>  * config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix 
> incorrect anticipate info.
> 
> gcc/testsuite/ChangeLog:
> 
>  * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: 
> Adapt test.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
>  * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
OK.

Do we need to backport this to gcc-13?

jeff


[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641

--- Comment #16 from Andrew Pinski  ---
Note starting in GCC 13 at -O3, we are able to optimize this all the way to:
```
int f ()
{
  int * _54;

   [local count: 114863531]:
  _54 = operator new (16);
  MEM  [(char * {ref-all})_54] = 0x300020001;
  operator delete (_54, 16);
  return 0;

}
```
If we change the name of the function to f rather than main. In GCC 14 we able
to optimize even without the name change (there has been some inlining heurstic
changes there).

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2023-08-09 Thread ed at catmur dot uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #55 from Ed Catmur  ---
(In reply to Roman Krotov from comment #54)
[[nodiscard]] is in C23, so we can expect that attribute to be adopted where
people intend that behavior (warning suppressible by cast to void) as opposed
to the nonportable [[gnu::warn_unused_result]] (warning not suppressible by
cast to void). So this problem will resolve itself, over time.

[Bug tree-optimization/66678] loop counter not accurately described by vrp

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

--- Comment #4 from Andrew Pinski  ---
So in GCC 12+ after evrp
  # RANGE [0, 4294967294] NONZERO 4294967295
  _1 = (long unsigned intD.16) i_9;
  # RANGE [0, 17179869176] NONZERO 17179869180
  _2 = _1 * 4;
...
  # RANGE [1, 4294967295]
  i_17 = i_9 + 1;

So maybe this has been fix ...

[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library

2023-08-09 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #7 from Thomas Neumann  ---
Thanks for the pointer, I could reproduce the problem in a VM now.

That shared library uses an usual table encoding that has to reference the
original base pointer within get_pc_range. But when deregistering a frame we
simply set the base pointer to nullptr, which does not work here.

I will write a patch that makes sure we always have to correct base pointer
available.

Re: [RFC PATCH 0/2] RISC-V: __builtin_riscv_pause for all environment

2023-08-09 Thread Tsukasa OI via Gcc-patches
On 2023/08/10 5:05, Jeff Law wrote:
> 
> 
> On 8/9/23 00:11, Tsukasa OI via Gcc-patches wrote:
>> Hello,
>>
>> I found that a built-in function "__builtin_riscv_pause" is not usable
>> unless we enable the 'Zihintpause' extension explicitly (still, this
>> built-in exists EVEN IF the 'Zihintpause' extension is disabled).
>>
>> Contents of a.c:
>>
>>> void sample(void)
>>> {
>>>  __builtin_riscv_pause();
>>> }
>>
>>
>> Compiling with the 'Zihintpause' extension is fine.
>>
>>> $ riscv64-unknown-elf-gcc -O2 -march=rv64i_zihintpause -mabi=lp64 -c a.c
>>
>>
>> However, compiling without the 'Zihintpause' causes an assembler error
>> (tested with GNU Binutils 2.41):
>>
>>> $ riscv64-unknown-elf-gcc -O2 -march=rv64i -mabi=lp64 -c a.c
>>> /tmp/ccFjacAz.s: Assembler messages:
>>> /tmp/ccFjacAz.s:11: Error: unrecognized opcode `pause', extension
>>> `zihintpause' required
>>
>>
>> This is because:
>>
>> 1.  GCC does not handle the 'Zihintpause' extension and
>> 2.  "riscv_pause" (insn) unconditionally emits "pause" even if the
>>  assembler does not accept it (when the extension is disabled).
>>
>>
>> This patch set (PATCH 1/2) resolves this issue by:
>>
>> 1.  Handling the 'Zihintpause' extension and
>> 2.  Splitting the "__builtin_riscv_pause" implementation
>>  depending on the existence of the 'Zihintpause' extension.
>>
>> Because a released version of GCC defines "__builtin_riscv_pause"
>> unconditionally, I chose to define no-'Zihintpause' version.
>>
>> There is another option to unconditionally emit ".insn 0x010f"
>> (the machine code of "pause") but I didn't because I wanted to improve
>> the
>> diagnostics (e.g. *.s output).
>>
>> I also fixed the description of this built-in function (in PATCH 2/2).
>>
>>
>> I'm not sure whether this is a good method to split the implementation
>> depending on the 'Zihintpause' extension.  Other than that, I believe
>> that
>> this is okay and approval is appreciated.
>>
>> Note that because I assigned copyright of GCC contribution to FSF, I
>> didn't
>> attach "Signed-off-by" tag.  Tell me if I need it.
> I'd tend to think we do not want to expose the intrinsic unless the
> right extensions are enabled -- even though the encoding is a no-op and
> we could emit it as a .insn.

I think that makes sense.  The only reason I implemented the
no-'Zihintpause' version is because GCC 13 implemented the built-in
unconditionally.  If the compatibility breakage is considered minimum (I
don't know, though), I'm ready to submit 'Zihintpause'-only version of
this patch set.

Thanks,
Tsukasa

> 
> If others think otherwise, I'll go with the consensus opinion.  So let's
> hold off a bit until others have chimed in.
> 
> Thanks,
> jeff
> 


[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2023-08-09 Thread romato.san1337 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

Roman Krotov  changed:

   What|Removed |Added

 CC||romato.san1337 at gmail dot com

--- Comment #54 from Roman Krotov  ---
I'm also very frustrated about this behavior, but I'm not asking to **change**
it. I see your point and, although I think it's severely outdated now, I
respect your decision.

Instead I'm offering a compromise, or to better describe it, a win-win
solution.
What about introducing something like `-Wno-void-unused`, which would only
disable the warnings for the (void) casted calls?

And I'm not asking to do it for me, or some other random dudes; do it for
systemd, they are also very frustrated about this:
https://github.com/systemd/systemd/blob/5a96b32dead5132ba37a8b968c59101c2aac8d23/meson.build#L400

That's because they use (void) casts very frequently:
https://github.com/search?q=repo%3Asystemd%2Fsystemd+language%3AC+%22%28void%29%22=code

You can probably say they are in the wrong by doing the bad practice and should
do a macro instead.
And you may be right. I really appreciate your commitment to teach people good
practices, but **suggesting** them (by setting the default behavior which
**recommends** the programmers to use these practices) should be enough.
I don't think it's a good idea to **force** people to use them, by either
spamming them with warnings (useless to them, because they already made a
decision) or making them disable such warnings altogether, which could lead
them to miss a bug where they forgot to use a value of a function (which they
didn't cast with void).

You may say it's their fault, but let me ask you a question, who is winning
from the current situation? Certainly, not you, because you didn't convince
them to change, instead you just made them not listen to you, see also
julia-language and mpv:
https://github.com/search?q=%22-Wno-unused-result%22=code

And certainly, not them, because of the bug-risk I typed above, which is really
unfortunate for such a big and widely-used project as systemd.
Also, making them switch to clang (or drop gcc support) would be really bad,
because nothing hurts FOSS and frustrates me in a project more than a
permissive license. Eww..

Sorry, if I was a little bit harsh there, I really appreciate your contribution
to FOSS and Linux development, and I would never switch to clang.
Just, please note, that, again, I'm begging you to just implement a non-default
switch that would make everyone's lives easier.

Why "everyone's"? You see,

If a new programmer encounters this "problem", they would think it's just a
"gcc bug", because "clang works as expected" (and also the mentioned projects
wrote like this in a comment), so they just disable the warnings altogether.
Nobody won, especially the programmer.
Now consider the situation where you implemented the switch, but wrote, for
example, in the manpage just that "It's not recommended". Now they will be
curious why, so they will try to find an answer, and the search will probably
lead them to the gnu.org page which has the answer.

See, how even though you might think you gave the ability to write a bad code,
in reality you would give the exact point why not to use (void)? (Which they
really need because other compilers/linters are trying to convince them
otherwise.)
Come on, at least please admit that the situation described above would still
be better than today's.

If you will find at least some of my points somewhat reasonable, please share
my comment with the people that can help bring this improvement to life.

[Bug middle-end/48580] missed optimization: integer overflow checks

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580

--- Comment #25 from Andrew Pinski  ---
We have this now:
  if (tmp.3_3 > 0)
goto ; [59.00%]
  else
goto ; [41.00%]

   [local count: 633507679]:
  _10 = _12 == 0;

   [local count: 1073741824]:
  # iftmp.2_5 = PHI <_10(3), 0(2)>

I suspect what we could do is in isel change that to:
iftmp.2_5 = tmp.3_3 > 0 ? _12 == 0 : 0;
(which itself gets optimized to:
iftmp.2_5 = (tmp.3_3 > 0) & (_12 == 0)

Which should produce better code at least.

[Bug tree-optimization/83247] simplify (int)a_long < 0 when we know a_long fits in int

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83247

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot 
gnu.org
 CC||pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
I thought simplify_casted_compare in vr-values.cc was supposed to do that but
it looks like it is not.

Let me try to see how to fix simplify_casted_compare .

[PATCH v4 7/8] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Currently, the tokens obtained from a destringified _Pragma string do not get
assigned proper locations while they are being lexed.  After the tokens have
been obtained, they are reassigned the same location as the _Pragma token,
which is sufficient to make things like _Pragma("GCC diagnostic ignored...")
operate correctly, but this still results in inferior diagnostics, since the
diagnostics do not point to the problematic tokens.  Further, if a diagnostic
is issued by libcpp during the lexing of the tokens, as opposed to being
issued by the frontend during the processing of the pragma, then the
patched-up location is not yet in place, and the user rather sees an invalid
location that is near to the location of the _Pragma string in some cases, or
potentially very far away, depending on the macro expansion history.  For
example:

=
_Pragma("GCC diagnostic ignored \"oops")
=

produces the diagnostic:

file.cpp:1:24: warning: missing terminating " character
1 | _Pragma("GCC diagnostic ignored \"oops")
  |^

with the caret in a nonsensical location, while this one:

=
 #define S "GCC diagnostic ignored \"oops"
_Pragma(S)
=

produces:

file.cpp:2:24: warning: missing terminating " character
2 | _Pragma(S)
  |^

with both the caret in a nonsensical location, and the actual relevant context
completely absent.

Fix this by assigning proper locations using the new LC_GEN type of linemap.
Now the tokens are given locations inside a generated content buffer, and the
macro expansion stack is modified to be aware that these tokens logically
belong to the "expansion" of the _Pragma directive. For the above examples we
now output:

==
In buffer generated from file.cpp:1:
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:1:1: note: in <_Pragma directive>
1 | _Pragma("GCC diagnostic ignored \"oops")
  | ^~~
==

and

==
:1:24: warning: missing terminating " character
1 | GCC diagnostic ignored "oops
  |^
file.cpp:2:1: note: in <_Pragma directive>
2 | _Pragma(S)
  | ^~~
==

So that carets are pointing to something meaningful and all relevant context
appears in the diagnostic.  For the second example, it would be nice if the
macro expansion also output "in expansion of macro S", however doing that for
a general case of macro expansions makes the logic very complicated, since it
has to be done after the fact when the macro maps have already been
constructed.  It doesn't seem worth it for this case, given that the _Pragma
string has already been output once on the first line.

gcc/ChangeLog:

* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Add awareness
of _Pragma directive to the macro expansion trace.

libcpp/ChangeLog:

* directives.cc (get_token_no_padding): Add argument to receive the
virtual location of the token.
(get__Pragma_string): Likewise.
(do_pragma): Set pfile->directive_result->src_loc properly, it should
not be a virtual location.
(destringize_and_run): Update to provide proper locations for the
_Pragma string tokens.  Support raw strings.
(_cpp_do__Pragma): Adapt to changes to the helper functions.
* errors.cc (cpp_diagnostic_at): Support
cpp_reader::diagnostic_rebase_loc.
(cpp_diagnostic_with_line): Likewise.
* include/line-map.h (class rich_location): Add new member
forget_cached_expanded_locations().
* internal.h (struct _cpp__Pragma_state): Define new struct.
(_cpp_rebase_diagnostic_location): Declare new function.
(struct cpp_reader): Add diagnostic_rebase_loc member.
(_cpp_push__Pragma_token_context): Declare new function.
(_cpp_do__Pragma): Adjust prototype.
* macro.cc (pragma_str): New static var.
(builtin_macro): Adapt to new implementation of _Pragma processing.
(_cpp_pop_context): Fix the logic for resetting
pfile->top_most_macro_node, which previously was never triggered,
although the error seems to have been harmless.
(_cpp_push__Pragma_token_context): New function.
(_cpp_rebase_diagnostic_location): New function.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (token_streamer::stream): Pass the virtual location of
the _Pragma token to maybe_print_line(), not the spelling location.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adjust for new
macro tracking output for _Pragma directives.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/cpp/diagnostic-pragma-1.c: Adjust for new macro
tracking output for _Pragma directives.
* c-c++-common/cpp/pr57580.c: Likewise.
* c-c++-common/gomp/pragma-3.c: Likewise.

[Bug tree-optimization/95643] Optimizer fails to realize that a variable tested twice in a row is the same both times

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95643

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=95757

[PATCH v4 6/8] diagnostics: Full support for generated data locations

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Previous patches in this series have laid the groundwork for supporting
source code locations in memory ("generated data") rather than ordinary
files. This patch completes the support by adding awareness of such
locations to all places that need to support them. The main changes are to
diagnostic-show-locus.cc; the others are primarily small tweaks such as
changing from the FILE to the SRC member when inspecting an
expanded_location.

gcc/c-family/ChangeLog:

* c-format.cc (get_corrected_substring): Use the new overload of
location_get_source_line() to support generated data.
* c-indentation.cc (get_visual_column): Likewise.
(get_first_nws_vis_column): Change argument from a plain file name
to a source_id.
(detect_intervening_unindent): Likewise.
(should_warn_for_misleading_indentation): Pass
detect_intervening_unindent() the SRC field rather than the FILE
field from the expanded_location.

gcc/ChangeLog:

* gcc-rich-location.cc (blank_line_before_p): Use the new overload
of location_get_source_line() to support generated data.
* input.cc (get_source_text_between): Likewise.
(get_substring_ranges_for_loc): Likewise.
(get_source_file_content): Change the argument from a plain filename
to a source_id.
(location_missing_trailing_newline): Likewise.
* input.h (get_source_file_content): Adjust prototype.
(location_missing_trailing_newline): Likewise.
* diagnostic-show-locus.cc (layout::calculate_x_offset_display): Use
the new overload of location_get_source_line() to support generated
data.
(layout::print_line): Likewise.
(class line_corrections): Change m_filename from a plain filename to
a source_id.
(source_line::source_line): Change argument from a plain filename to
a source_id.
(line_corrections::add_hint): Adapt to source_line change.
(layout::print_trailing_fixits): Adapt to line_corrections change.
(test_layout_x_offset_display_utf8): Test generated data too.
(test_layout_x_offset_display_tab): Likewise.
(test_diagnostic_show_locus_one_liner): Likewise.
(test_diagnostic_show_locus_one_liner_utf8): Likewise.
(test_add_location_if_nearby): Likewise.
(test_diagnostic_show_locus_fixit_lines): Likewise.
(test_fixit_consolidation): Likewise.
(test_overlapped_fixit_printing): Likewise.
(test_overlapped_fixit_printing_utf8): Likewise.
(test_overlapped_fixit_printing_2): Likewise.
(test_fixit_insert_containing_newline): Likewise.
(test_fixit_insert_containing_newline_2): Likewise.
(test_fixit_replace_containing_newline): Likewise.
(test_fixit_deletion_affecting_newline): Likewise.
(test_tab_expansion): Likewise.
(test_escaping_bytes_1): Likewise.
(test_escaping_bytes_2): Likewise.
(test_line_numbers_multiline_range): Likewise.
(diagnostic_show_locus_cc_tests): Likewise.
---
 gcc/c-family/c-format.cc  |   2 +-
 gcc/c-family/c-indentation.cc |   8 +-
 gcc/diagnostic-show-locus.cc  | 227 ++
 gcc/gcc-rich-location.cc  |   2 +-
 gcc/input.cc  |  21 ++--
 gcc/input.h   |   6 +-
 6 files changed, 136 insertions(+), 130 deletions(-)

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index 529b1408179..929ec24622c 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -4537,7 +4537,7 @@ get_corrected_substring (const substring_loc _loc,
   if (caret.column > finish.column)
 return NULL;
 
-  char_span line = location_get_source_line (start.file, start.line);
+  char_span line = location_get_source_line (start);
   if (!line)
 return NULL;
 
diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
index fce74991aae..27a90d9cc15 100644
--- a/gcc/c-family/c-indentation.cc
+++ b/gcc/c-family/c-indentation.cc
@@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
   unsigned int *first_nws,
   unsigned int tab_width)
 {
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   if (!line)
 return false;
   if ((size_t)exploc.column > line.length ())
@@ -87,7 +87,7 @@ get_visual_column (expanded_location exploc,
Otherwise, return false, leaving *FIRST_NWS untouched.  */
 
 static bool
-get_first_nws_vis_column (const char *file, int line_num,
+get_first_nws_vis_column (source_id file, int line_num,
  unsigned int *first_nws,
  unsigned int tab_width)
 {
@@ -158,7 +158,7 @@ get_first_nws_vis_column (const char *file, int line_num,
Return true if such an unindent/outdent is detected.  */
 
 static bool
-detect_intervening_unindent (const char *file,

[Bug tree-optimization/95643] Optimizer fails to realize that a variable tested twice in a row is the same both times

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95643

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Andrew Pinski  ---
Fixed for GCC 11 by r11-7448-gff92ede8d269375f800e1 .

[PATCH v4 5/8] diagnostics: Support testing generated data in input.cc selftests

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Add selftests for the new capabilities in input.cc related to source code
locations that are stored in memory rather than ordinary files.

gcc/ChangeLog:

* input.cc (temp_source_file::do_linemap_add): New function.
(line_table_case::line_table_case): Add GENERATED_DATA argument.
(line_table_test::line_table_test): Implement new M_GENERATED_DATA
argument.
(for_each_line_table_case): Optionally include generated data
locations in the set of cases.
(test_accessing_ordinary_linemaps): Test generated data locations.
(test_make_location_nonpure_range_endpoints): Likewise.
(test_line_offset_overflow): Likewise.
(input_cc_tests): Likewise.
* selftest.cc (named_temp_file::named_temp_file): Interpret a null
SUFFIX argument as a request to use in-memory data.
(named_temp_file::~named_temp_file): Support in-memory data.
(temp_source_file::temp_source_file): Likewise.
(temp_source_file::~temp_source_file): Likewise.
* selftest.h (struct line_map_ordinary): Foward declare.
(class named_temp_file): Add missing explicit to the constructor.
(class temp_source_file): Add new members to support in-memory data.
(class line_table_test): Likewise.
(for_each_line_table_case): Adjust prototype.
---
 gcc/input.cc| 81 +
 gcc/selftest.cc | 53 +---
 gcc/selftest.h  | 19 ++--
 3 files changed, 113 insertions(+), 40 deletions(-)

diff --git a/gcc/input.cc b/gcc/input.cc
index 790279d4273..8c4e40aaf23 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -2066,6 +2066,20 @@ get_num_source_ranges_for_substring (cpp_reader *pfile,
 
 /* Selftests of location handling.  */
 
+/* Wrapper around linemap_add to handle transparently adding either a tmp file,
+   or in-memory generated content.  */
+const line_map_ordinary *
+temp_source_file::do_linemap_add (int line)
+{
+  const line_map *map;
+  if (content_buf)
+map = linemap_add (line_table, LC_GEN, false, content_buf,
+  line, content_len);
+  else
+map = linemap_add (line_table, LC_ENTER, false, get_filename (), line);
+  return linemap_check_ordinary (map);
+}
+
 /* Verify that compare() on linenum_type handles comparisons over the full
range of the type.  */
 
@@ -2144,13 +2158,16 @@ assert_loceq (const char *exp_filename, int 
exp_linenum, int exp_colnum,
 class line_table_case
 {
 public:
-  line_table_case (int default_range_bits, int base_location)
+  line_table_case (int default_range_bits, int base_location,
+  bool generated_data)
   : m_default_range_bits (default_range_bits),
-m_base_location (base_location)
+m_base_location (base_location),
+m_generated_data (generated_data)
   {}
 
   int m_default_range_bits;
   int m_base_location;
+  bool m_generated_data;
 };
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -2167,6 +2184,7 @@ line_table_test::line_table_test ()
   gcc_assert (saved_line_table->round_alloc_size);
   line_table->round_alloc_size = saved_line_table->round_alloc_size;
   line_table->default_range_bits = 0;
+  m_generated_data = false;
 }
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -2188,6 +2206,7 @@ line_table_test::line_table_test (const line_table_case 
_)
   line_table->highest_location = case_.m_base_location;
   line_table->highest_line = case_.m_base_location;
 }
+  m_generated_data = case_.m_generated_data;
 }
 
 /* Destructor.  Restore the old value of line_table.  */
@@ -2207,7 +2226,10 @@ test_accessing_ordinary_linemaps (const line_table_case 
_)
   line_table_test ltt (case_);
 
   /* Build a simple linemap describing some locations. */
-  linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
+  if (ltt.m_generated_data)
+linemap_add (line_table, LC_GEN, false, "some data", 0, 10);
+  else
+linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
 
   linemap_line_start (line_table, 1, 100);
   location_t loc_a = linemap_position_for_column (line_table, 1);
@@ -2257,21 +2279,23 @@ test_accessing_ordinary_linemaps (const line_table_case 
_)
   linemap_add (line_table, LC_LEAVE, false, NULL, 0);
 
   /* Verify that we can recover the location info.  */
-  assert_loceq ("foo.c", 1, 1, loc_a);
-  assert_loceq ("foo.c", 1, 23, loc_b);
-  assert_loceq ("foo.c", 2, 1, loc_c);
-  assert_loceq ("foo.c", 2, 17, loc_d);
-  assert_loceq ("foo.c", 3, 700, loc_e);
-  assert_loceq ("foo.c", 4, 100, loc_back_to_short);
+  const auto fname
+= (ltt.m_generated_data ? special_fname_generated () : "foo.c");
+  assert_loceq (fname, 1, 1, loc_a);
+  assert_loceq (fname, 1, 23, loc_b);
+  assert_loceq (fname, 2, 1, loc_c);
+  assert_loceq (fname, 2, 17, loc_d);
+  assert_loceq (fname, 3, 700, loc_e);
+  assert_loceq (fname, 4, 100, loc_back_to_short);
 
   /* In the very wide line, the 

[PATCH v4 3/8] diagnostics: Refactor class file_cache_slot

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Class file_cache_slot in input.cc is used to query specific lines of source
code from a file when needed by diagnostics infrastructure. This will be
extended in a subsequent patch to support obtaining the source code from
in-memory generated buffers rather than from a file. The present patch
refactors class file_cache_slot, putting most of the logic into a new base
class cache_data_source, in preparation for reusing that code in the next
patch. There is no change in functionality yet.

gcc/ChangeLog:

* input.cc (class file_cache_slot): Refactor functionality into a
new base class...
(class cache_data_source): ...here.
(file_cache::forcibly_evict_file): Adapt for refactoring.
(file_cache_slot::evict): Renamed to...
(file_cache_slot::reset): ...this, and partially refactored into
base class...
(cache_data_source::reset): ...here.
(file_cache_slot::get_full_file_content): Moved into base class...
(cache_data_source::get_full_file_content): ...here.
(file_cache_slot::create): Adapt for refactoring.
(file_cache_slot::file_cache_slot): Refactor partially into...
(cache_data_source::cache_data_source): ...here.
(file_cache_slot::~file_cache_slot): Refactor partially into...
(cache_data_source::~cache_data_source): ...here.
(file_cache_slot::needs_read_p): Remove.
(file_cache_slot::needs_grow_p): Remove.
(file_cache_slot::maybe_grow): Adapt for refactoring.
(file_cache_slot::read_data): Refactored, along with...
(file_cache_slot::maybe_read_data): this, into...
(file_cache_slot::get_more_data): ...here.
(find_end_of_line): Change interface to take a pair of pointers,
rather than a pointer + length.
(file_cache_slot::get_next_line): Refactored into...
(cache_data_source::get_next_line): ...here.
(file_cache_slot::goto_next_line): Refactored into...
(cache_data_source::goto_next_line): ...here.
(file_cache_slot::read_line_num): Refactored into...
(cache_data_source::read_line_num): ...here.
(location_get_source_line): Fix const-correctness as necessitated by
new interface.
---
 gcc/input.cc | 513 +++
 1 file changed, 235 insertions(+), 278 deletions(-)

diff --git a/gcc/input.cc b/gcc/input.cc
index c2559614a99..9377020b460 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -55,34 +55,88 @@ file_cache::initialize_input_context 
(diagnostic_input_charset_callback ccb,
   in_context.should_skip_bom = should_skip_bom;
 }
 
-/* This is a cache used by get_next_line to store the content of a
-   file to be searched for file lines.  */
-class file_cache_slot
+/* This is an abstract interface for a class that provides data which we want 
to
+   look up by line number.  Concrete implementations will follow, which handle
+   the cases of reading the data from the input source files, or of reading it
+   from in-memory generated data buffers.  The design is driven with reading
+   from files in mind, in particular it is desirable to read only as much of a
+   file from disk as necessary.  It works like a simplified std::istream, i.e.
+   virtual function calls are only needed when we need to retrieve more data
+   from the underlying source.  */
+
+class cache_data_source
 {
-public:
-  file_cache_slot ();
-  ~file_cache_slot ();
 
-  bool read_line_num (size_t line_num,
- char ** line, ssize_t *line_len);
-
-  /* Accessors.  */
-  const char *get_file_path () const { return m_file_path; }
+public:
+  bool read_line_num (size_t line_num, const char **line, ssize_t *line_len);
   unsigned get_use_count () const { return m_use_count; }
+  void inc_use_count () { m_use_count++; }
+  bool get_next_line (const char **line, ssize_t *line_len);
+  bool goto_next_line ();
   bool missing_trailing_newline_p () const
   {
 return m_missing_trailing_newline;
   }
   char_span get_full_file_content ();
+  bool unused () const { return !m_data_begin; }
+  virtual void reset ();
+
+protected:
+  cache_data_source ();
+  virtual ~cache_data_source ();
+
+  /* These pointers delimit the data that we are processing.  They are
+ maintained by the derived classes, we only ask for more by calling
+ get_more_data().  That function should return TRUE if more data was
+ obtained.  Calling get_more_data () may invalidate these pointers
+ (i.e. reallocating them to a larger buffer).  */
+  const char *m_data_begin;
+  const char *m_data_end;
+  virtual bool get_more_data () = 0;
+
+  /* This is to be called by the derived classes when this object is
+ being activated.  */
+  void on_create (unsigned int use_count, size_t total_lines)
+  {
+m_use_count = use_count;
+m_total_lines = total_lines;
+  }
 
-  void inc_use_count () { m_use_count++; }
+private:
+  /* Non-copyable.  */
+  cache_data_source (const 

[PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers

2023-08-09 Thread Lewis Hyatt via Gcc-patches
This patch enhances location_get_source_line(), which is the primary
interface provided by the diagnostics infrastructure to obtain the line of
source code corresponding to a given location, so that it understands
generated data locations in addition to normal file-based locations. This
involves changing the argument to location_get_source_line() from a plain
file name, to a source_id object that can represent either type of location.

gcc/ChangeLog:

* input.cc (class data_cache_slot): New class.
(file_cache::lookup_data): New function.
(diagnostics_file_cache_forcibly_evict_data): New function.
(file_cache::forcibly_evict_data): New function.
(file_cache::evicted_cache_tab_entry): Generalize (via a template)
to work for both file_cache_slot and data_cache_slot.
(file_cache::add_file): Adapt for new interface to
evicted_cache_tab_entry.
(file_cache::add_data): New function.
(data_cache_slot::create): New function.
(file_cache::file_cache): Support the new m_data_slots member.
(file_cache::~file_cache): Likewise.
(file_cache::lookup_or_add_data): New function.
(file_cache::lookup_or_add): New function that calls either
lookup_or_add_data or lookup_or_add_file as appropriate.
(location_get_source_line): Change the FILE_PATH argument to a
source_id SRC, and use it to support obtaining source lines from
generated data as well as from files.
(location_compute_display_column): Support generated data using the
new features of location_get_source_line.
(dump_location_info): Likewise.
* input.h (location_get_source_line): Adjust prototype. Add a new
convenience overload taking an expanded_location.
(class cache_data_source): Declare.
(class data_cache_slot): Declare.
(class file_cache): Declare new members.
(diagnostics_file_cache_forcibly_evict_data): Declare.
---
 gcc/input.cc | 171 ---
 gcc/input.h  |  23 +--
 2 files changed, 153 insertions(+), 41 deletions(-)

diff --git a/gcc/input.cc b/gcc/input.cc
index 9377020b460..790279d4273 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -207,6 +207,28 @@ private:
   void maybe_grow ();
 };
 
+/* This is the implementation of cache_data_source for generated
+   data that is already in memory.  */
+class data_cache_slot final : public cache_data_source
+{
+public:
+  void create (const char *data, unsigned int data_len,
+  unsigned int highest_use_count);
+  bool represents_data (const char *data, unsigned int) const
+  {
+/* We can just use pointer equality here since the generated data lives in
+   memory in one persistent place.  It isn't anticipated there would be
+   several generated data buffers with the same content, so we don't mind
+   that in such a case we will store it twice.  */
+return m_data_begin == data;
+  }
+
+protected:
+  /* In contrast to file_cache_slot, we do not own a buffer.  The buffer
+ passed to create() needs to outlive this object.  */
+  bool get_more_data () override { return false; }
+};
+
 /* Current position in real source file.  */
 
 location_t input_location = UNKNOWN_LOCATION;
@@ -382,6 +404,21 @@ file_cache::lookup_file (const char *file_path)
   return r;
 }
 
+data_cache_slot *
+file_cache::lookup_data (const char *data, unsigned int data_len)
+{
+  for (unsigned int i = 0; i != num_file_slots; ++i)
+{
+  const auto slot = m_data_slots + i;
+  if (slot->represents_data (data, data_len))
+   {
+ slot->inc_use_count ();
+ return slot;
+   }
+}
+  return nullptr;
+}
+
 /* Purge any mention of FILENAME from the cache of files used for
printing source code.  For use in selftests when working
with tempfiles.  */
@@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file (const char 
*file_path)
   global_dc->m_file_cache->forcibly_evict_file (file_path);
 }
 
+void
+diagnostics_file_cache_forcibly_evict_data (const char *data,
+   unsigned int data_len)
+{
+  if (!global_dc->m_file_cache)
+return;
+  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
+}
+
 void
 file_cache::forcibly_evict_file (const char *file_path)
 {
@@ -410,36 +456,36 @@ file_cache::forcibly_evict_file (const char *file_path)
   r->reset ();
 }
 
+void
+file_cache::forcibly_evict_data (const char *data, unsigned int data_len)
+{
+  if (auto r = lookup_data (data, data_len))
+r->reset ();
+}
+
 /* Return the cache that has been less used, recently, or the
first empty one.  If HIGHEST_USE_COUNT is non-null,
*HIGHEST_USE_COUNT is set to the highest use count of the entries
in the cache table.  */
 
-file_cache_slot*
-file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
+template 
+Slot *
+file_cache::evicted_cache_tab_entry (Slot 

[PATCH v4 8/8] diagnostics: Support generated data locations in SARIF output

2023-08-09 Thread Lewis Hyatt via Gcc-patches
The diagnostics routines for SARIF output need to read the source code back
in, so that they can generate "snippet" and "content" records, so they need to
be able to cope with generated data locations.  Add support for that in
diagnostic-format-sarif.cc.

gcc/ChangeLog:

* diagnostic-format-sarif.cc (class sarif_builder): Adapt interface
to support generated data locations.
(sarif_builder::maybe_make_physical_location_object): Change the
m_filenames hash_set to support generated data.
(sarif_builder::make_artifact_location_object): Use a source_id rather
than a plain file name.
(sarif_builder::maybe_make_region_object): Adapt to
expanded_location interface changes.
(sarif_builder::maybe_make_region_object_for_context): Likewise.
(sarif_builder::make_artifact_object): Likewise.
(sarif_builder::make_run_object): Handle generated data.
(sarif_builder::maybe_make_artifact_content_object): Likewise.
(get_source_lines): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/diagnostic-format-sarif-file-5.c: New test.
---
 gcc/diagnostic-format-sarif.cc| 88 +++
 .../diagnostic-format-sarif-file-5.c  | 31 +++
 2 files changed, 82 insertions(+), 37 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c

diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 1eff71962d7..c7c0e5d4b0a 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -174,7 +174,7 @@ private:
   json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const;
   json::object *maybe_make_physical_location_object (location_t loc);
   json::object *make_artifact_location_object (location_t loc);
-  json::object *make_artifact_location_object (const char *filename);
+  json::object *make_artifact_location_object (source_id src);
   json::object *make_artifact_location_object_for_pwd () const;
   json::object *maybe_make_region_object (location_t loc) const;
   json::object *maybe_make_region_object_for_context (location_t loc) const;
@@ -197,9 +197,9 @@ private:
   json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const;
   json::object *
   make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id);
-  json::object *make_artifact_object (const char *filename);
-  json::object *maybe_make_artifact_content_object (const char *filename) 
const;
-  json::object *maybe_make_artifact_content_object (const char *filename,
+  json::object *make_artifact_object (source_id src);
+  json::object *maybe_make_artifact_content_object (source_id src) const;
+  json::object *maybe_make_artifact_content_object (source_id src,
int start_line,
int end_line) const;
   json::object *make_fix_object (const rich_location _loc);
@@ -220,7 +220,11 @@ private:
  diagnostic group.  */
   sarif_result *m_cur_group_result;
 
-  hash_set  m_filenames;
+  /* If the second member is >0, then this is a buffer of generated content,
+ with that length, not a filename.  */
+  hash_set ,
+  int_hash  >
+   > m_filenames;
   bool m_seen_any_relative_paths;
   hash_set  m_rule_id_set;
   json::array *m_rules_arr;
@@ -787,7 +791,8 @@ sarif_builder::maybe_make_physical_location_object 
(location_t loc)
   /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
   json::object *artifact_loc_obj = make_artifact_location_object (loc);
   phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
-  m_filenames.add (LOCATION_FILE (loc));
+  const auto src = LOCATION_SRC (loc);
+  m_filenames.add ({src.get_filename_or_buffer (), src.get_buffer_len ()});
 
   /* "region" property (SARIF v2.1.0 section 3.29.4).  */
   if (json::object *region_obj = maybe_make_region_object (loc))
@@ -811,7 +816,7 @@ sarif_builder::maybe_make_physical_location_object 
(location_t loc)
 json::object *
 sarif_builder::make_artifact_location_object (location_t loc)
 {
-  return make_artifact_location_object (LOCATION_FILE (loc));
+  return make_artifact_location_object (LOCATION_SRC (loc));
 }
 
 /* The ID value for use in "uriBaseId" properties (SARIF v2.1.0 section 3.4.4)
@@ -823,10 +828,13 @@ sarif_builder::make_artifact_location_object (location_t 
loc)
or return NULL.  */
 
 json::object *
-sarif_builder::make_artifact_location_object (const char *filename)
+sarif_builder::make_artifact_location_object (source_id src)
 {
   json::object *artifact_loc_obj = new json::object ();
 
+  const auto filename = src.is_buffer ()
+? special_fname_generated () : src.get_filename_or_buffer ();
+
   /* "uri" property (SARIF v2.1.0 section 3.4.3).  */
   artifact_loc_obj->set ("uri", new json::string (filename));
 
@@ -912,9 +920,9 @@ sarif_builder::maybe_make_region_object (location_t loc) 

[PATCH v4 1/8] libcpp: Add LC_GEN linemaps to support in-memory buffers

2023-08-09 Thread Lewis Hyatt via Gcc-patches
Add a new linemap reason LC_GEN which enables encoding the location of data
that was generated during compilation and does not appear in any source file.
There could be many use cases, such as, for instance, referring to the content
of builtin macros (not yet implemented, but an easy lift after this one.) The
first intended application is to create a place to store the input to a
_Pragma directive, so that proper locations can be assigned to those
tokens. This will be done in a subsequent commit.

The TO_FILE member of struct line_map_ordinary has been changed to a union
named SRC which can be either a file name, or a pointer to a line_map_data
struct describing the data. There is no space overhead added to the line
maps data structures.

Outside libcpp, this patch includes only the minimal changes implied by the
adjustment from TO_FILE to SRC in struct line_map_ordinary. Subsequent
patches will implement the new functionality.

libcpp/ChangeLog:

* include/line-map.h (enum lc_reason): Add LC_GEN.
(struct line_map_data): New struct.
(struct line_map_ordinary): Change TO_FILE from a char* to a union,
and rename to SRC.
(class source_id): New class.
(ORDINARY_MAP_GENERATED_DATA_P): New function.
(ORDINARY_MAP_GENERATED_DATA): New function.
(ORDINARY_MAP_GENERATED_DATA_LEN): New function.
(ORDINARY_MAP_SOURCE_ID): New function.
(ORDINARY_MAPS_SAME_FILE_P): New function.
(ORDINARY_MAP_CONTAINING_FILE_NAME): Declare.
(LINEMAP_FILE): Adapt to struct line_map_ordinary change.
(linemap_get_file_highest_location): Likewise.
* line-map.cc (source_id::operator==): New function.
(ORDINARY_MAP_CONTAINING_FILE_NAME): New function.
(linemap_add): Support creating LC_GEN maps.
(linemap_line_start): Support LC_GEN maps.
(linemap_check_files_exited): Likewise.
(linemap_position_for_loc_and_offset): Likewise.
(linemap_get_expansion_filename): Likewise.
(linemap_dump): Likewise.
(linemap_dump_location): Likewise.
(linemap_get_file_highest_location): Likewise.
* directives.cc (_cpp_do_file_change): Likewise.

gcc/c-family/ChangeLog:

* c-common.cc (try_to_locate_new_include_insertion_point): Recognize
and ignore LC_GEN maps.

gcc/cp/ChangeLog:

* module.cc (module_state::write_ordinary_maps): Recognize and
ignore LC_GEN maps, and adapt to interface change in struct
line_map_ordinary.
(module_state::read_ordinary_maps): Likewise.

gcc/ChangeLog:

* diagnostic-show-locus.cc (compatible_locations_p): Adapt to
interface change in struct line_map_ordinary.
* input.cc (special_fname_generated): New function.
(dump_location_info): Support LC_GEN maps.
(get_substring_ranges_for_loc): Adapt to interface change in struct
line_map_ordinary.
* input.h (special_fname_generated): Declare.

gcc/go/ChangeLog:

* go-linemap.cc (Gcc_linemap::to_string): Recognize and ignore
LC_GEN maps.
---
 gcc/c-family/c-common.cc |  11 ++-
 gcc/cp/module.cc |   8 +-
 gcc/diagnostic-show-locus.cc |   2 +-
 gcc/go/go-linemap.cc |   3 +-
 gcc/input.cc |  27 +-
 gcc/input.h  |   1 +
 libcpp/directives.cc |   4 +-
 libcpp/include/line-map.h| 144 
 libcpp/line-map.cc   | 181 +--
 9 files changed, 299 insertions(+), 82 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 9fbaeb437a1..ecfc2efc29f 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9206,19 +9206,22 @@ try_to_locate_new_include_insertion_point (const char 
*file, location_t loc)
   const line_map_ordinary *ord_map
= LINEMAPS_ORDINARY_MAP_AT (line_table, i);
 
+  if (ORDINARY_MAP_GENERATED_DATA_P (ord_map))
+   continue;
+
   if (const line_map_ordinary *from
  = linemap_included_from_linemap (line_table, ord_map))
/* We cannot use pointer equality, because with preprocessed
   input all filename strings are unique.  */
-   if (0 == strcmp (from->to_file, file))
+   if (ORDINARY_MAP_SOURCE_ID (from) == file)
  {
last_include_ord_map = from;
last_ord_map_after_include = NULL;
  }
 
-  /* Likewise, use strcmp, and reject any line-zero introductory
-map.  */
-  if (ord_map->to_line && 0 == strcmp (ord_map->to_file, file))
+  /* Likewise, use strcmp (via the source_id comparison), and reject any
+line-zero introductory map.  */
+  if (ord_map->to_line && ORDINARY_MAP_SOURCE_ID (ord_map) == file)
{
  if (!first_ord_map_in_file)
first_ord_map_in_file = ord_map;
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index ea362bdffa4..ff17cd57016 100644
--- 

[PATCH v4 2/8] libcpp: diagnostics: Support generated data in expanded locations

2023-08-09 Thread Lewis Hyatt via Gcc-patches
The previous patch in this series introduced the concept of LC_GEN line
maps. This patch continues on the path to using them to improve _Pragma
diagnostics, by adding a new source_id SRC member to struct
expanded_location, which is populated by linemap_expand_location. This
member allows call sites to detect and handle when a location refers to
generated data rather than a plain file name.

The previous FILE member of expanded_location is preserved (although
redundant with SRC), so that call sites which do not and never will care
about generated data do not need to be concerned about it. Call sites that
will care are modified here, to use SRC rather than FILE for comparing
locations.

libcpp/ChangeLog:

* include/line-map.h (struct expanded_location): Add SRC member. Add
zero-initializers for all members, since source_id is not a POD
type.
(class fixit_hint): Adjust prototype.
* line-map.cc (linemap_expand_location): Populate the new SRC member
in the expanded_location.
(rich_location::maybe_add_fixit): Compare explocs with the new SRC
field instead of the FILE field.
(fixit_hint::affects_line_p): Accept a source_id instead of a file
name, and use it for the comparisons.

gcc/c-family/ChangeLog:

* c-format.cc (get_corrected_substring): Compare explocs with the
new SRC field instead of the FILE field.
* c-indentation.cc (should_warn_for_misleading_indentation): Likewise.
(assert_get_visual_column_succeeds): Initialize the SRC field in the
test expanded_location.
(assert_get_visual_column_fails): Likewise.

gcc/ChangeLog:

* diagnostic-show-locus.cc (make_range): Adapt to the new
constructor semantics for struct expanded_location.
(layout::maybe_add_location_range): Compare explocs with the new SRC
field instead of the FILE field.
(layout::validate_fixit_hint_p): Likewise.
(layout::print_leading_fixits): Use the SRC field in struct
expanded_location to query fixit_hint::affects_line_p.
(layout::print_trailing_fixits): Likewise.
* diagnostic.cc (diagnostic_report_current_module): Use the new SRC
field in expanded_location to detect LC_GEN locations and identify
them as such.
(assert_location_text): Adapt to the new constructor semantics for
struct expanded_location.
* input.cc (expand_location_1): Likewise. And when libcpp's
linemap_expand_location returns a null FILE for generated data,
replace it with special_fname_generated ().
(total_lines_num): Handle a generic source_id argument rather than a
file name only.
(get_source_text_between): Compare explocs with the new SRC field
instead of the FILE field.
(get_substring_ranges_for_loc): Likewise.
* edit-context.cc (edit_context::apply_fixit): Ignore locations in
generated data.
* input.h (LOCATION_SRC): New accessor macro.
---
 gcc/c-family/c-format.cc  |  4 ++--
 gcc/c-family/c-indentation.cc | 10 +-
 gcc/diagnostic-show-locus.cc  | 30 +-
 gcc/diagnostic.cc | 19 ---
 gcc/edit-context.cc   |  2 +-
 gcc/input.cc  | 21 +++--
 gcc/input.h   |  1 +
 libcpp/include/line-map.h | 24 ++--
 libcpp/line-map.cc| 15 +++
 9 files changed, 70 insertions(+), 56 deletions(-)

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index b4eeebcb30e..529b1408179 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -4522,9 +4522,9 @@ get_corrected_substring (const substring_loc _loc,
 = expand_location_to_spelling_point (fmt_substring_range.m_start);
   expanded_location finish
 = expand_location_to_spelling_point (fmt_substring_range.m_finish);
-  if (caret.file != start.file)
+  if (caret.src != start.src)
 return NULL;
-  if (start.file != finish.file)
+  if (start.src != finish.src)
 return NULL;
   if (caret.line != start.line)
 return NULL;
diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
index e8d3dece770..fce74991aae 100644
--- a/gcc/c-family/c-indentation.cc
+++ b/gcc/c-family/c-indentation.cc
@@ -334,7 +334,7 @@ should_warn_for_misleading_indentation (const 
token_indent_info _tinfo,
   const unsigned int tab_width = global_dc->tabstop;
 
   /* They must be in the same file.  */
-  if (next_stmt_exploc.file != body_exploc.file)
+  if (next_stmt_exploc.src != body_exploc.src)
 return false;
 
   /* If NEXT_STMT_LOC and BODY_LOC are on the same line, consider
@@ -363,7 +363,7 @@ should_warn_for_misleading_indentation (const 
token_indent_info _tinfo,
   ^ DON'T WARN HERE.  */
   if (next_stmt_exploc.line == body_exploc.line)
 {
-  if (guard_exploc.file != 

[PATCH v4 0/8] diagnostics: libcpp: Overhaul locations for _Pragma tokens

2023-08-09 Thread Lewis Hyatt via Gcc-patches
On Mon, Jul 31, 2023 at 06:39:15PM -0400, Lewis Hyatt wrote:
> On Fri, Jul 28, 2023 at 6:58 PM David Malcolm  wrote:
> >
> > On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > > Add a new linemap reason LC_GEN which enables encoding the location
> > > of data
> > > that was generated during compilation and does not appear in any
> > > source file.
> > > There could be many use cases, such as, for instance, referring to
> > > the content
> > > of builtin macros (not yet implemented, but an easy lift after this
> > > one.) The
> > > first intended application is to create a place to store the input to
> > > a
> > > _Pragma directive, so that proper locations can be assigned to those
> > > tokens. This will be done in a subsequent commit.
> > >
> > > The actual change needed to the line-maps API in libcpp is not too
> > > large and
> > > requires no space overhead in the line map data structures (on 64-bit
> > > systems
> > > that is; one newly added data member to class line_map_ordinary sits
> > > inside
> > > former padding bytes.) An LC_GEN map is just an ordinary map like any
> > > other,
> > > but the TO_FILE member that normally points to the file name points
> > > instead to
> > > the actual data.  This works automatically with PCH as well, for the
> > > same
> > > reason that the file name makes its way into a PCH.  In order to
> > > avoid
> > > confusion, the member has been renamed from TO_FILE to DATA, and
> > > associated
> > > accessors adjusted.
> > >
> > > Outside libcpp, there are many small changes but most of them are to
> > > selftests, which are necessarily more sensitive to implementation
> > > details. From the perspective of the user (the "user", here, being a
> > > frontend
> > > using line maps or else the diagnostics infrastructure), the chief
> > > visible
> > > change is that the function location_get_source_line() should be
> > > passed an
> > > expanded_location object instead of a separate filename and line
> > > number.  This
> > > is not a big change because in most cases, this information came
> > > anyway from a
> > > call to expand_location and the needed expanded_location object is
> > > readily
> > > available. The new overload of location_get_source_line() uses the
> > > extra
> > > information in the expanded_location object to obtain the data from
> > > the
> > > in-memory buffer when it originated from an LC_GEN map.
> > >
> > > Until the subsequent patch that starts using LC_GEN maps, none are
> > > yet
> > > generated within GCC, hence nothing is added to the testsuite here;
> > > but all
> > > relevant selftests have been extended to cover generated data maps in
> > > addition
> > > to normal files.
> >
> > [..snip...]
> >
> > Thanks for the updated patch.
> >
> > Reading this patch, it felt a bit unnatural to me to have an
> >   (exploded location, source line)
> > pair where the exploded location seems to be representing "which source
> > file or generated buffer", but the line/column info in that
> > exploded_location is to be ignored in favor of the 2nd source line.
> >
> > I think we're missing a class: something that identifies either a
> > specific source file, or a specific generated buffer.
> >
> > How about something like either:
> >
> > class source_id
> > {
> > public:
> >   source_id (const char *filename)
> >   : m_filename_or_buffer (filename),
> > m_len (0)
> >   {
> >   }
> >
> >   explicit source_id (const char *buffer, unsigned buffer_len)
> >   : m_filename_or_buffer (buffer),
> > m_len (buffer_len)
> >   {
> > linemap_assert (buffer_len > 0);
> >   }
> >
> > private:
> >   const char *m_filename_or_buffer;
> >   unsigned m_len;  // where 0 means "it's a filename"
> > };
> >
> > or:
> >
> > class source_id
> > {
> > public:
> >   source_id (const char *filename)
> >   : m_ptr (filename),
> > m_is_buffer (false)
> >   {
> >   }
> >
> >   explicit source_id (const linemap_ordinary *buffer_linemap)
> >   : m_ptr (buffer_linemap),
> > m_is_buffer (true)
> >   {
> >   }
> >
> > private:
> >   const void *m_ptr;
> >   bool m_is_buffer;
> > };
> >
> > and use one of these "source_id file" in place of "const char *file",
> > rather than replacing such things with expanded_location?
> >
> > > diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
> > > index e8d3dece770..4164fa0b1ba 100644
> > > --- a/gcc/c-family/c-indentation.cc
> > > +++ b/gcc/c-family/c-indentation.cc
> > > @@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
> > >  unsigned int *first_nws,
> > >  unsigned int tab_width)
> > >  {
> > > -  char_span line = location_get_source_line (exploc.file, exploc.line);
> > > +  char_span line = location_get_source_line (exploc);
> >
> > ...so this might contine to be:
> >
> >   char_span line = location_get_source_line (exploc.file, exploc.line);
> >
> > ...but expanded_location's "file" field would become a source_id,
> > rather than a const char *.  It 

[Bug tree-optimization/100095] missed optimization for dead code elimination at -O3 (vs. -O2)

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100095

Andrew Pinski  changed:

   What|Removed |Added

  Known to work||12.1.0
   Keywords||needs-bisection

--- Comment #2 from Andrew Pinski  ---
Looks to be fixed for GCC 12.

[Bug tree-optimization/27109] Simplify "a - 10 > 150" into "a > 160" when range of a is known (in VRP or somewhere else)

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27109

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot 
gnu.org

--- Comment #7 from Andrew Pinski  ---
I suspect we could add it to simplify_compare_using_ranges_1 very easy.
Let me look into that.

Re: [PATCH V2] riscv: generate builtin macro for compilation with strict alignment:

2023-08-09 Thread Edwin Lu

Hi Vineet,

On 8/8/2023 2:02 PM, Vineet Gupta wrote:


Maybe add a comment that in absence of -m[no-]strict-align, we use the 
cpu tune param -> slow_unaligned_access and that default mcpu is rocket 
which has that set to _slow.




That sounds good to me!


+#if defined(__riscv_unaligned_avoid)
+#error "__riscv_unaligned_avoid is unexpectedly set"
+#endif


Lets first check what is really expected.
#if !defined (_slow) #error


+
+/* Check __riscv_unaligned_slow xor __riscv_unaligned_fast is set.  */
+#if !defined(__riscv_unaligned_slow) && !defined(__riscv_unaligned_fast)
+#error "either __riscv_unaligned_slow or__riscv_unaligned_fast is not 
set"

+#endif
+
+#if defined(__riscv_unaligned_slow) && defined(__riscv_unaligned_fast)
+#error "both __riscv_unaligned_slow and__riscv_unaligned_fast are set"
+#endif
+


I think we can simplify this a bit (sorry I if wasn't clear enough in 
our off-list discussions).
We now need to ensure that unexpected toggles are NOT defined: #if 
defined(_avoid) || defined(_fast) #error
I don't think we need to check for the combinations as that is covered 
by first one and this.


While separate #error prints for 2 failures are ideal, personally it 
feels excessive given that the current implementation will only ever 
generate 1 of them. If a future code change breaks that assumption, the 
onus is on that change to fix/update the errors.




That makes sense to me. I was thinking that adding some more checks 
would help clarify the assumptions but it did just make things overly 
verbose.




Same as my comment for attribute-1. Please recheck all of them.



Thanks for the feedback! I will be sure to update all of them in the 
next iteration.


Edwin



Re: [PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-09 Thread David Malcolm via Gcc
On Wed, 2023-08-09 at 15:22 -0400, Eric Feng wrote:
> Thank you for your help in getting dg-require-python-h working! I can
> confirm that the FAILs are related to differences between the --
> cflags
> affecting the gimple seen by the analyzer. For this reason, I have
> changed it to --includes for now. 

Sounds good.

Eventually we'll probably want to support --cflags, but given that
every distribution probably has its own set of flags, it's a recipe for
an unpleasantly large test matrix, so just using --includes is a good
compromise.

> To be sure, I tested on Python 3.8 as
> well and it works as expected. I have also addressed the following
> comments on the WIP patch as you described.
> 
> -- Update Changelog entry to list new functions being simulated.
> -- Update region_model::get_or_create_region_for_heap_alloc leading
> comment.
> -- Change register_alloc to update_state_machine.
> -- Change move_ptr_sval_non_null to transition_ptr_sval_non_null.
> -- Static helper functions for:
> -- Initializing ob_refcnt field.
> -- Setting ob_type field.
> -- Getting ob_base field.
> -- Initializing heap allocated region for PyObjects.
> -- Incrementing a field by one.
> -- Change arg_is_long_p to arg_is_integral_p.
> -- Extract common failure scenario for reusability.
> 
> The initial WIP patch using 
> 
> /* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */
> 
> have been bootstrapped and regtested on aarch64-unknown-linux-gnu.
> Since
> we did not change any core logic in the revision and the only changes
> within the analyzer core are changing variable names, is it OK for
> trunk. In the mean time, the revised patch is currently going through
> bootstrap and regtest process.

Thanks for the updated patch.

Unfortunately I just pushed a largish analyzer patch (r14-3114-
g73da34a538ddc2) which may well conflict with your patch, so please
rebase to beyond that.  

Sorry about this.

In particular note that there's no longer a default assignment to the
LHS at a call-site in region_model::on_call_pre; known_function
subclasses are now responsible for assigning to the LHS of the
callsite.  But I suspect that all the known_function subclasses in the
cpython plugin already do that.

Some nits inline below...

[...snip...]

> Some concessions were made to
> simplify the analysis process when comparing kf_PyList_Append with
> the
> real implementation. In particular, PyList_Append performs some
> optimization internally to try and avoid calls to realloc if
> possible. For simplicity, we assume that realloc is called every
> time.
> Also, we grow the size by just 1 (to ensure enough space for adding a
> new element) rather than abide by the heuristics that the actual
> implementation
> follows.

Might be worth capturing these notes as comments in the source (for
class kf_PyList_Append), rather than just within the commit message.

[...snip...]
> 
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index e92b3f7b074..c338f045d92 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats
> (const svalue *size_in_bytes,
>     Use CTXT to complain about tainted sizes.
>  
>     Reuse an existing heap_allocated_region if it's not being
> referenced by
> -   this region_model; otherwise create a new one.  */
> +   this region_model; otherwise create a new one.
> +
> +   Optionally (update_state_machine) transitions the pointer
> pointing to the
> +   heap_allocated_region from start to assumed non-null.  */
>  
>  const region *
>  region_model::get_or_create_region_for_heap_alloc (const svalue
> *size_in_bytes,
> - 
> region_model_context *ctxt)
> +   region_model_context *ctxt,
> +   bool update_state_machine,
> +   const call_details *cd)
>  {
>    /* Determine which regions are referenced in this region_model, so
> that
>   we can reuse an existing heap_allocated_region if it's not in
> use on
> @@ -5153,6 +5158,17 @@
> region_model::get_or_create_region_for_heap_alloc (const svalue
> *size_in_bytes,
>    if (size_in_bytes)
>  if (compat_types_p (size_in_bytes->get_type (), size_type_node))
>    set_dynamic_extents (reg, size_in_bytes, ctxt);
> +
> +   if (update_state_machine && cd)
> +   {
> +   const svalue *ptr_sval = nullptr;
> +   if (cd->get_lhs_type ())
> +   ptr_sval = m_mgr->get_ptr_svalue (cd->get_lhs_type (), reg);
> +   else
> +   ptr_sval = m_mgr->get_ptr_svalue (NULL_TREE, reg);
> +   transition_ptr_sval_non_null (ctxt,
> ptr_sval);

This if/else is redundant: the "else" is only reached if cd-
>get_lhs_type () is null, in which case you pass in NULL_TREE, so it
works the same either way.  Or am I missing something?

Also, it looks like something weird's happening with 

Re: [PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-09 Thread David Malcolm via Gcc-patches
On Wed, 2023-08-09 at 15:22 -0400, Eric Feng wrote:
> Thank you for your help in getting dg-require-python-h working! I can
> confirm that the FAILs are related to differences between the --
> cflags
> affecting the gimple seen by the analyzer. For this reason, I have
> changed it to --includes for now. 

Sounds good.

Eventually we'll probably want to support --cflags, but given that
every distribution probably has its own set of flags, it's a recipe for
an unpleasantly large test matrix, so just using --includes is a good
compromise.

> To be sure, I tested on Python 3.8 as
> well and it works as expected. I have also addressed the following
> comments on the WIP patch as you described.
> 
> -- Update Changelog entry to list new functions being simulated.
> -- Update region_model::get_or_create_region_for_heap_alloc leading
> comment.
> -- Change register_alloc to update_state_machine.
> -- Change move_ptr_sval_non_null to transition_ptr_sval_non_null.
> -- Static helper functions for:
> -- Initializing ob_refcnt field.
> -- Setting ob_type field.
> -- Getting ob_base field.
> -- Initializing heap allocated region for PyObjects.
> -- Incrementing a field by one.
> -- Change arg_is_long_p to arg_is_integral_p.
> -- Extract common failure scenario for reusability.
> 
> The initial WIP patch using 
> 
> /* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */
> 
> have been bootstrapped and regtested on aarch64-unknown-linux-gnu.
> Since
> we did not change any core logic in the revision and the only changes
> within the analyzer core are changing variable names, is it OK for
> trunk. In the mean time, the revised patch is currently going through
> bootstrap and regtest process.

Thanks for the updated patch.

Unfortunately I just pushed a largish analyzer patch (r14-3114-
g73da34a538ddc2) which may well conflict with your patch, so please
rebase to beyond that.  

Sorry about this.

In particular note that there's no longer a default assignment to the
LHS at a call-site in region_model::on_call_pre; known_function
subclasses are now responsible for assigning to the LHS of the
callsite.  But I suspect that all the known_function subclasses in the
cpython plugin already do that.

Some nits inline below...

[...snip...]

> Some concessions were made to
> simplify the analysis process when comparing kf_PyList_Append with
> the
> real implementation. In particular, PyList_Append performs some
> optimization internally to try and avoid calls to realloc if
> possible. For simplicity, we assume that realloc is called every
> time.
> Also, we grow the size by just 1 (to ensure enough space for adding a
> new element) rather than abide by the heuristics that the actual
> implementation
> follows.

Might be worth capturing these notes as comments in the source (for
class kf_PyList_Append), rather than just within the commit message.

[...snip...]
> 
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index e92b3f7b074..c338f045d92 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats
> (const svalue *size_in_bytes,
>     Use CTXT to complain about tainted sizes.
>  
>     Reuse an existing heap_allocated_region if it's not being
> referenced by
> -   this region_model; otherwise create a new one.  */
> +   this region_model; otherwise create a new one.
> +
> +   Optionally (update_state_machine) transitions the pointer
> pointing to the
> +   heap_allocated_region from start to assumed non-null.  */
>  
>  const region *
>  region_model::get_or_create_region_for_heap_alloc (const svalue
> *size_in_bytes,
> - 
> region_model_context *ctxt)
> +   region_model_context *ctxt,
> +   bool update_state_machine,
> +   const call_details *cd)
>  {
>    /* Determine which regions are referenced in this region_model, so
> that
>   we can reuse an existing heap_allocated_region if it's not in
> use on
> @@ -5153,6 +5158,17 @@
> region_model::get_or_create_region_for_heap_alloc (const svalue
> *size_in_bytes,
>    if (size_in_bytes)
>  if (compat_types_p (size_in_bytes->get_type (), size_type_node))
>    set_dynamic_extents (reg, size_in_bytes, ctxt);
> +
> +   if (update_state_machine && cd)
> +   {
> +   const svalue *ptr_sval = nullptr;
> +   if (cd->get_lhs_type ())
> +   ptr_sval = m_mgr->get_ptr_svalue (cd->get_lhs_type (), reg);
> +   else
> +   ptr_sval = m_mgr->get_ptr_svalue (NULL_TREE, reg);
> +   transition_ptr_sval_non_null (ctxt,
> ptr_sval);

This if/else is redundant: the "else" is only reached if cd-
>get_lhs_type () is null, in which case you pass in NULL_TREE, so it
works the same either way.  Or am I missing something?

Also, it looks like something weird's happening with 

[Bug tree-optimization/95747] [OpenMP/Builtin] nontemporal store support

2023-08-09 Thread freddie at witherden dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95747

Freddie Witherden  changed:

   What|Removed |Added

 CC||freddie at witherden dot org

--- Comment #1 from Freddie Witherden  ---
It looks as if recent versions of GCC support the nontemporal OpenMP 5
attribute as:

typedef double aligned_double __attribute__ ((aligned (64)));

void f(int n, aligned_double * a, aligned_double * b)
{
#pragma omp simd nontemporal(b)
for (int i = 0; i < n; i++)
b[i] = 2*a[i];
}

But do not actually generate non-temporal store instructions (even though the
data is suitably aligned).

Re: [PATCH 0/12] GCC _BitInt support [PR102989]

2023-08-09 Thread Joseph Myers
On Wed, 9 Aug 2023, Jakub Jelinek via Gcc-patches wrote:

> - _Complex _BitInt(N) isn't supported; again mainly because none of the psABIs
>   mention how those should be passed/returned; in a limited way they are
>   supported internally because the internal functions into which
>   __builtin_{add,sub,mul}_overflow{,_p} is lowered return COMPLEX_TYPE as a
>   hack to return 2 values without using references/pointers

What happens when the usual arithmetic conversions are applied to 
operands, one of which is a complex integer type and the other of which is 
a wider _BitInt type?  I don't see anything in the code to disallow this 
case (which would produce an expression with a _Complex _BitInt type), or 
any testcases for it.

Other testcases I think should be present (along with any corresponding 
changes needed to the code itself):

* Verifying that the new integer constant suffix is rejected for C++.

* Verifying appropriate pedwarn-if-pedantic for the new constant suffix 
for versions of C before C2x (and probably for use of _BitInt type 
specifiers before C2x as well) - along with the expected -Wc11-c2x-compat 
handling (in C2x mode) / -pedantic -Wno-c11-c2x-compat in older modes.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PING 2][PATCH 0/9] Add btf_decl_tag C attribute

2023-08-09 Thread David Faust via Gcc-patches
Ping for this series.

https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624156.html

Thanks

On 7/11/23 14:57, David Faust via Gcc-patches wrote:
> Hello,
> 
> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> The same attribute is already supported in clang, and is used by various
> components of the BPF ecosystem.
> 
> The purpose of the attribute is to allow to associate (to "tag")
> declarations with arbitrary string annotations, which are emitted into
> debugging information (DWARF and/or BTF) to facilitate post-compilation
> analysis (the motivating use case being the Linux kernel BPF verifier).
> Multiple tags are allowed on the same declaration.
> 
> These strings are not interpreted by the compiler, and the attribute
> itself has no effect on generated code, other than to produce additional
> DWARF DIEs and/or BTF records conveying the annotations.
> 
> This entails:
> 
> - A new C-language-level attribute which allows to associate (to "tag")
>   particular declarations with arbitrary strings.
> 
> - The conveyance of that information in DWARF in the form of a new DIE,
>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
>   the same purpose. These DIEs are already supported by BPF tooling,
>   such as pahole.
> 
> - The conveyance of that information in BTF debug info in the form of
>   BTF_KIND_DECL_TAG records. These records are already supported by
>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
>   eBPF verifier.
> 
> 
> Background
> ==
> 
> The purpose of these tags is to convey additional semantic information
> to post-compilation consumers, in particular the Linux kernel eBPF
> verifier. The verifier can make use of that information while analyzing
> a BPF program to aid in determining whether to allow or reject the
> program to be run. More background on these tags can be found in the
> early support for them in the kernel here [1] and [2].
> 
> The "btf_decl_tag" attribute is half the story; the other half is a
> sibling attribute "btf_type_tag" which serves the same purpose but
> applies to types. Support for btf_type_tag will come in a separate
> patch series, since it is impaced by GCC bug 110439 which needs to be
> addressed first.
> 
> I submitted an initial version of this work (including btf_type_tag)
> last spring [3], however at the time there were some open questions
> about the behavior of the btf_type_tag attribute and issues with its
> implementation. Since then we have clarified these details and agreed
> to solutions with the BPF community and LLVM BPF folks.
> 
> The main motivation for emitting the tags in DWARF is that the Linux
> kernel generates its BTF information via pahole, using DWARF as a source:
> 
> ++  BTF  BTF   +--+
> | pahole |---> vmlinux.btf --->| verifier |
> ++ +--+
> ^^
> ||
>   DWARF |BTF |
> ||
>   vmlinux  +-+
>   module1.ko   | BPF program |
>   module2.ko   +-+
> ...
> 
> This is because:
> 
> a)  pahole adds additional kernel-specific information into the
> produced BTF based on additional analysis of kernel objects.
> 
> b)  Unlike GCC, LLVM will only generate BTF for BPF programs.
> 
> b)  GCC can generate BTF for whatever target with -gbtf, but there is no
> support for linking/deduplicating BTF in the linker.
> 
> In the scenario above, the verifier needs access to the pointer tags of
> both the kernel types/declarations (conveyed in the DWARF and translated
> to BTF by pahole) and those of the BPF program (available directly in BTF).
> 
> 
> DWARF Representation
> 
> 
> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> the following format:
> 
>   DW_TAG_GNU_annotation (0x6000)
> DW_AT_name: "btf_decl_tag"
> DW_AT_const_value: 
> 
> These DIEs are placed in the DWARF tree as children of the DIE for the
> appropriate declaration, and one such DIE is created for each occurrence
> of the btf_decl_tag attribute on a declaration.
> 
> For example:
> 
>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag 
> ("devicemem")));
> 
> This declaration produces the following DWARF:
> 
>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
> <1f>   DW_AT_name: c
> <24>   DW_AT_type: <0x49>
> ...
>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
> <37>   DW_AT_name: (indirect string, 

Re: LRA for avr: Arithmetic on stack pointer

2023-08-09 Thread Vladimir Makarov via Gcc



On 8/9/23 07:15, senthilkumar.selva...@microchip.com wrote:

Hi,

   After turning on FP -> SP elimination after Vlad fixed
   an elimination issue in 
https://gcc.gnu.org/git?p=gcc.git;a=commit;h=2971ff7b1d564ac04b537d907c70e6093af70832,
   I'm now running into reload failure if arithmetic is done on SP.

   For a call to a vararg functions, the avr target pushes args into the stack,
   calls the function, and then adjusts the SP back to where it was before the
   arg pushing occurred.

   So for code like

extern int foo(int, ...);
int bar(void) {
   long double l = 1.2345E6;
   foo(0, l);
   return 0;
}

With some efforts, I reproduced this problem.

   and

$ avr-gcc -mmcu=avr51 -Os ../20031208-1.c
   

...



   I guess the condition exists to ensure sp_off is always correct? Considering 
LRA already
   handles post_dec of SP just fine, perhaps it can allow RTX like


It is a very old code when LRA elimination was pretty constraint.


(set (reg/f:HI 32 __SP_L__)
  (plus:HI (reg/f:HI 32 __SP_L__)
   (const_int 10 [0xa]))) "../20031208-1.c":5:10 discrim 1 165 
{*addhi3_split}

   as long as the PLUS/MINUS is by a constant, and update sp_off accordingly?

   Or is there something the avr target has to do differently?


I think we can permit to stack pointer output reloads.  The only thing 
we need to update sp offset accurately for the original and reload 
insns.  I'll try to make the patch on this week.





Re: Intel AVX10.1 Compiler Design and Support

2023-08-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Aug 09, 2023 at 08:43:00PM +, Joseph Myers wrote:
> At this point it seems appropriate to remind people of another ABI 
> consideration for vector extensions.  glibc's libmvec defines vector 
> versions of various functions, including AVX512 ones (of course those 
> function versions only work on hardware with the relevant instructions).  
> glibc's headers use both _Pragma ("omp declare simd notinbranch") and 
> __attribute__ ((__simd__ ("notinbranch"))) to declare, to the compiler 
> including those headers, what function variants are available in glibc.

For omp declare simd or simd attribute that simply implies that the
variants with 512-bit vectors may only be called from -mavx512f or
-mavx10.1-512 (or how the switch will be called code), not from -mavx10.1.
We shouldn't change that ABI because of AVX10.

Jakub



Re: [patch] libgomp: cuda.h and omp_target_memcpy_rect cleanup (was: [patch] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect)

2023-08-09 Thread Thomas Schwinge
Hi Tobias!

On 2023-07-28T13:51:41+0200, Tobias Burnus  wrote:
> On 27.07.23 23:00, Thomas Schwinge wrote:
>>> +  else if (src_devicep != NULL
>>> +&& (dst_devicep == NULL
>>> +|| (dst_devicep->capabilities
>>> +& GOMP_OFFLOAD_CAP_SHARED_MEM)))
>> Are these 'GOMP_OFFLOAD_CAP_SHARED_MEM' actually reachable, given that
>> 'omp_target_memcpy_check' (via 'omp_target_memcpy_rect_check') clears out
>> the device to 'NULL' for 'GOMP_OFFLOAD_CAP_SHARED_MEM'?
>
> I have now undone this change – I did not dig deep enough into the
> function calls.
>
>
>>> +  else if (dst_devicep == NULL && src_devicep == NULL)
>>> + {
>>> +   memcpy ((char *) dst + dst_off, (const char *) src + src_off,
>>> +   length);
>>> +   ret = 1;
>>> + }
>>> else if (src_devicep == dst_devicep)
>>>ret = src_devicep->dev2dev_func (src_devicep->target_id,
>>> (char *) dst + dst_off,
>>> (const char *) src + src_off,
>>> length);
>> ..., but also left the intra-device case here -- which should now be dead
>> code here?
>
> Why? Unless I missed something, the old, the current, and the proposed
> (= old) code do still run this code.

It is now again reachable, but wasn't in the "intermediate state"
(without your follow-on change) -- at least per my understanding, which
may be gappy.

> I have not added an assert to confirm, but in any case, it is tested for
> in my recently added testcase - thus, we could add a 'printf' to confirm.

We're now back to the original code, which should be fine.

>>> +   else if (*tmp_size < length)
>>> + {
>>> +   *tmp_size = length;
>>> +   *tmp = realloc (*tmp, length);
>>> +   if (*tmp == NULL)
>>> + return ENOMEM;
>> If 'realloc' returns 'NULL', we should 'free' the original '*tmp'?
>>
>> Do we really need here the property here that if the re-allocation can't
>> be done in-place, 'realloc' copies the original content to the new?  In
>> other words, should we just unconditionally 'free' and re-'malloc' here,
>> instead of 'realloc'?
> I have now done so – but I am not really sure what's faster on average.
> If it can be enlarged, 'realloc' is faster, if it cannot free+malloc is
> better.

I have no proof, but would assume that the C library handles as
efficiently as 'realloc' the case of 'free' plus subsequent 'malloc' if
there's space available after the original allocation.

>> I haven't looked whether the re-use of 'tmp' for multiple calls to this
>> is then actually useful, or whether we should just always 'malloc', use,
>> 'free' the buffer here?

..., hence that suggestion.  But I agree what we've got now is good, just
one more idea: couldn't we now actually unify the (original) 'malloc' and
(original) 'realloc' case:

-if (*tmp_size == 0)
-  {
-*tmp_size = length;
-*tmp = malloc (length);
-if (*tmp == NULL)
-  return ENOMEM;
-  }
-else if (*tmp_size < length)
+if (*tmp_size < length)
   {
 *tmp_size = length;
 free (*tmp);
 *tmp = malloc (length);
 if (*tmp == NULL)
   return ENOMEM;
   }

(Untested.)

> Well, it can run in a hot loop – assume a C-array array[1024][1024][2]
> and copying array[:1024,:1024,0:1] (using OpenMP syntax) – i.e. 1048576
> times every other element. And therefore I would like to avoid repeated
> malloc/free in such a case. (But in general, interdevice copying should
> be very rare.)
>
> Actually, I think the realloc case is unreachable: for rectangular
> copies, as implied both by 'target update' with strided access and by
> 'omp_target_memcpy_rect', the size should be constant.

Worth an 'assert'?

Now I "only" don't understand the '__builtin_mul_overflow' computations
and error checks ('return EINVAL;') for the four cases handled in
'libgomp/target.c:omp_target_memcpy_rect_worker': for example, the
generic loop at end of function iterates over all 'num_dims', but the
specific earlier 'num_dims == N' cases don't.  But I suppose that's just
my own problem not understanding this API/code, and I'm not currently
planning on looking into this any further.  ;-)


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


RE: Intel AVX10.1 Compiler Design and Support

2023-08-09 Thread Joseph Myers
On Wed, 9 Aug 2023, Wang, Phoebe via Gcc-patches wrote:

> Proposal 3: Change the ABI of 512-bit vector and always be 
> passed/returned from memory.

Changing ABIs like that for existing code that has worked for some time on 
existing hardware is a bad idea.

At this point it seems appropriate to remind people of another ABI 
consideration for vector extensions.  glibc's libmvec defines vector 
versions of various functions, including AVX512 ones (of course those 
function versions only work on hardware with the relevant instructions).  
glibc's headers use both _Pragma ("omp declare simd notinbranch") and 
__attribute__ ((__simd__ ("notinbranch"))) to declare, to the compiler 
including those headers, what function variants are available in glibc.

Existing glibc versions need to continue to work with new compiler 
versions.  That is, it's part of the ABI, which must remain stable, 
exactly which function versions the above pragma and attribute imply are 
available - and of course the details of how those functions versions take 
arguments / return results are also part of the ABI (it would be OK for a 
new compiler to choose not to use some of those vector versions, but not 
to start calling them with a different ABI).

Maybe you'll want to add new vector function versions, with different 
interfaces, to libmvec in future.  If so, you need a *different* pragma or 
attribute to declare to the compiler that the libmvec version using that 
pragma or attribute has the additional functions - so new compilers using 
the existing header will not try to generate calls to new function 
versions that don't exist in that glibc version (but new compilers using a 
new header version from new glibc will see the new pragma or attribute and 
so be able to generate the relevant calls to new functions).  And once 
you've defined the ABI for such a new pragma or attribute, that itself 
then becomes a stable interface - so if you end up with vector extensions 
involving yet another set of interfaces, they need another corresponding 
new pragma / attribute for libmvec to declare to the compiler that the new 
interfaces exist.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-09 Thread Qing Zhao via Gcc-patches


> On Aug 8, 2023, at 10:54 AM, Martin Uecker  wrote:
> 
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.

As I checked the algorithm for bos in GCC,  it uses a similar formula as the 
following to compute the object size:

offset(struct foo, t[0]) + N * sizeof(*foo->t);

Which seems correct to me.  (Therefore bos returns 9 for this example).
> 
> 
> https://godbolt.org/z/K1hvaK1ns
> 
> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.

My understanding is, since a structure with FAM cannot be an element of an 
array, 
 the tailing padding might not be necessary?

Qing

> 
> 
> Martin
> 
> 
> 
> Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
>> On Fri, Aug 04, 2023 at 07:44:28PM +, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>>> are:
>> 
>> Thanks for the update!
>> 
>>> 
>>> 1. change the name "element_count" to "counted_by";
>>> 2. change the parameter for the attribute from a STRING to an
>>> Identifier;
>>> 3. Add logic and testing cases to handle anonymous structure/unions;
>>> 4. Clarify documentation to permit the situation when the allocation
>>> size is larger than what's specified by "counted_by", at the same time,
>>> it's user's error if allocation size is smaller than what's specified by
>>> "counted_by";
>>> 5. Add a complete testing case for using counted_by attribute in
>>> __builtin_dynamic_object_size when there is mismatch between the
>>> allocation size and the value of "counted_by", the expecting behavior
>>> for each case and the explanation on why in the comments. 
>> 
>> All the "normal" test cases I have are passing; this is wonderful! :)
>> 
>> I'm still seeing unexpected situations when I've intentionally set
>> counted_by to be smaller than alloc_size, but I assume it's due to not
>> yet having the patch you mention below.
>> 
>>> As discussed, I plan to add two more separate patch sets after this initial
>>> patch set is approved and committed.
>>> 
>>> set 1. A new warning option and a new sanitizer option for the user error
>>>when the allocation size is smaller than the value of "counted_by".
>>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>>case:
>>> 
>>> struct A
>>> {
>>> size_t foo;
>>> int array[] __attribute__((counted_by (foo)));
>>> };
>>> 
>>> extern struct fix * alloc_buf ();
>>> 
>>> int main ()
>>> {
>>> struct fix *p = alloc_buf ();
>>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * 
>>> sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */ 
>>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * 
>>> sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */
>>> }
>> 
>> Should the above be bdos instead of bos?
>> 
>>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
>> 
>> I've updated the Linux kernel's macros for the name change and done
>> build tests with my first pass at "easy" cases for adding counted_by:
>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by=adc5b3cb48a049563dc673f348eab7b6beba8a9b
>> 
>> Everything is working as expected. :)
>> 
>> -Kees
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging



Re: [PATCH] testsuite: Fix gcc.dg/analyzer/allocation-size-multiline-[123].c [PR 110426]

2023-08-09 Thread David Malcolm via Gcc-patches
On Tue, 2023-08-08 at 15:01 +, Christophe Lyon wrote:
> For 32-bit newlib targets (e.g. arm-eabi)  int32_t is "long int".
> 
> Like previous patches in these tests, update the matching regexps to
> match "aka (long )?int".
> 
> Tested on arm-eabi and aarch64-linux-gnu.

Sorry about this breakage.

These tests used to emit the infomation as multiple messages, but were
consolidated as a side-effect of r14-3001-g021077b94741c9.

I've just committed r14-3114-g73da34a538ddc2, a cleanup of the analyzer
code, which has a side-effect of splitting the messages back up.  I
believe that r14-3114 restores these tests to their pre-r14-3001 state,
but I might have messed up.

Does r14-3114-g73da34a538ddc2 fix the issues for you, or is some
patching still needed?

Dave


> 
> 2023-08-08  Christophe Lyon  
> 
> gcc/testsuite/
> PR analyzer/110426
> * gcc.dg/analyzer/allocation-size-multiline-1.c: Handle
> int32_t being "long int".
> * gcc.dg/analyzer/allocation-size-multiline-2.c: Likewise.
> * gcc.dg/analyzer/allocation-size-multiline-3.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c | 6 +++-
> --
>  gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c | 6 +++-
> --
>  gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c | 4 ++--
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-
> 1.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
> index 9938ba237a0..b56e4b4e8e1 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
> @@ -16,7 +16,7 @@ void test_constant_1 (void)
>  |   int32_t *ptr = __builtin_malloc (1);
>  |  ^~~~
>  |  |
> -    |  (1) allocated 1 bytes and assigned to
> 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> +    |  (1) allocated 1 bytes and assigned to
> 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> {re:long :re?}int})' is '4'
>  |
>     { dg-end-multiline-output "" } */
>  
> @@ -34,7 +34,7 @@ void test_constant_2 (void)
>  |   int32_t *ptr = __builtin_malloc (2);
>  |  ^~~~
>  |  |
> -    |  (1) allocated 2 bytes and assigned to
> 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> +    |  (1) allocated 2 bytes and assigned to
> 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> {re:long :re?}int})' is '4'
>  |
>     { dg-end-multiline-output "" } */
>  
> @@ -52,6 +52,6 @@ void test_symbolic (int n)
>  |   int32_t *ptr = __builtin_malloc (n * 2);
>  |  ^~~~
>  |  |
> -    |  (1) allocated 'n * 2' bytes and assigned to
> 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> +    |  (1) allocated 'n * 2' bytes and assigned to
> 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> {re:long :re?}int})' is '4'
>  |
>     { dg-end-multiline-output "" } */
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-
> 2.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
> index 9e1269cbb7a..8912913a78c 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
> @@ -16,7 +16,7 @@ void test_constant_1 (void)
>  |   int32_t *ptr = __builtin_alloca (1);
>  |  ^~~~
>  |  |
> -    |  (1) allocated 1 bytes and assigned to
> 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> +    |  (1) allocated 1 bytes and assigned to
> 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> {re:long :re?}int})' is '4'
>  |
>     { dg-end-multiline-output "" } */
>  
> @@ -33,7 +33,7 @@ void test_constant_2 (void)
>  |   int32_t *ptr = __builtin_alloca (2);
>  |  ^~~~
>  |  |
> -    |  (1) allocated 2 bytes and assigned to
> 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> +    |  (1) allocated 2 bytes and assigned to
> 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> {re:long :re?}int})' is '4'
>  |
>     { dg-end-multiline-output "" } */
>  
> @@ -50,7 +50,7 @@ void test_symbolic (int n)
>  |   int32_t *ptr = __builtin_alloca (n * 2);
>  |  ^~~~
>  |  |
> -    |  (1) allocated 'n * 2' bytes and assigned to
> 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> +    |  (1) 

Re: [PATCH 00/24] Sync shared build infrastructure with binutils-gdb

2023-08-09 Thread Joseph Myers
On Wed, 9 Aug 2023, Arsen Arsenović via Gcc-patches wrote:

> Joseph Myers  writes:
> 
> > On Tue, 8 Aug 2023, Arsen Arsenović via Gcc-patches wrote:
> >
> >> Yes.  Libtool was forked over a decade ago.  My next project is syncing
> >> upstream and us back up.  Unsure about pkg.m4.
> >
> > Note as per previous discussions that libtool commit 
> > 3334f7ed5851ef1e96b052f2984c4acdbf39e20c will need reverting in GCC when 
> > updating libtool because of incompatible usage of --with-sysroot.
> 
> I wanted to, somehow, coalesce the two back together, so that both are
> available.  Presumably, this means adding an option to libtool to accept
> host-sysroot or such, but I haven't done too much looking into this.
> 
> Is my interpretation of the issue correct?  (i.e. GCC uses sysroot to
> mean *target* sysroot rather than host sysroot)

GCC's sysroot is a sysroot containing target headers and libraries, yes.  
Also note:

* The --with-sysroot directory is the directory that would be used by GCC 
if installed in the configured --prefix (relocated if GCC ends up running 
from a different prefix).  At build time, --with-build-sysroot takes 
precedence.

* The --with-sysroot directory (and likewise the --with-build-sysroot 
directory or any directory specified with --sysroot= passed to GCC 
programs) may sometimes contain multiple sysroots in subdirectories; that 
directory gets combined with a suffix from SYSROOT_SUFFIX_SPEC to 
determine the actual subdirectory for the current multilib, within which 
there are subdirectories such as usr/include or usr/lib.

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug middle-end/110964] New: RISC-V vector ICE in expand_cond_len_ternop

2023-08-09 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110964

Bug ID: 110964
   Summary: RISC-V vector ICE in expand_cond_len_ternop
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jeremy.bennett at embecosm dot com
  Target Milestone: ---

Created attachment 55713
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55713=edit
C source code of the bug reproducer

The following code (testcase.c) causes an ICE when using RISC-V vector as
target.

int *a;
long b, c;

int d ()
{
  const int e;
  for (; a < e; a++)
c += *a * b;
}

Compiled with:

riscv64-unknown-linux-gnu-gcc -march=rv64gcv -mabi=lp64d -c \
-Ofast --param=riscv-autovec-preference=scalable testcase.c

Output is:

testcase.c: In function 'd':
testcase.c:7:12: warning: comparison between pointer and integer
7 |   for (; a < e; a++)
  |^
during RTL pass: expand
testcase.c:8:7: internal compiler error: in expand_cond_len_ternop, at
config/riscv/riscv-v.cc:3579
8 | c += *a * b;
  |   ^~
0x93b0f3 riscv_vector::expand_cond_len_ternop(unsigned int, rtx_def**)
/home/jeremy/gittrees/mustang/gcc/gcc/config/riscv/riscv-v.cc:3579
0x1be5b61 gen_cond_len_fmarvvm1di(rtx_def*, rtx_def*, rtx_def*, rtx_def*,
rtx_def*, rtx_def*, rtx_def*, rtx_def*)
/home/jeremy/gittrees/mustang/gcc/gcc/config/riscv/autovec.md:1644
0xf41dd2 rtx_insn* insn_gen_fn::operator()(rtx_def*, rtx_def*, rtx_def*,
rtx_def*, rtx_def*, rtx_def*, rtx_def*, rtx_def*) const
/home/jeremy/gittrees/mustang/gcc/gcc/recog.h:411
0xf41dd2 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
/home/jeremy/gittrees/mustang/gcc/gcc/optabs.cc:8237
0xf463cc maybe_expand_insn(insn_code, unsigned int, expand_operand*)
/home/jeremy/gittrees/mustang/gcc/gcc/optabs.cc:8265
0xf463cc expand_insn(insn_code, unsigned int, expand_operand*)
/home/jeremy/gittrees/mustang/gcc/gcc/optabs.cc:8296
0xdc6de0 expand_fn_using_insn
/home/jeremy/gittrees/mustang/gcc/gcc/internal-fn.cc:255
0xb72dbf expand_call_stmt
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:2737
0xb72dbf expand_gimple_stmt_1
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:3880
0xb72dbf expand_gimple_stmt
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:4044
0xb77bc6 expand_gimple_basic_block
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:6096
0xb79576 execute
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:6831
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

System information
--

Using built-in specs.
COLLECT_GCC=riscv64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/jeremy/gittrees/mustang/install/libexec/gcc/riscv64-unknown-linux-gnu/14.0.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with: /home/jeremy/gittrees/mustang/gcc/configure
--target=riscv64-unknown-linux-gnu
--prefix=/home/jeremy/gittrees/mustang/install
--with-sysroot=/home/jeremy/gittrees/mustang/install/sysroot
--with-pkgversion=g5c27c911f6b --with-system-zlib --enable-shared --enable-tls
--enable-languages=c,c++,fortran --disable-libmudflap --disable-libssp
--disable-libquadmath --disable-libsanitizer --disable-nls --disable-bootstrap
--src=/home/jeremy/gittrees/mustang/gcc --enable-multilib --with-abi=lp64d
--with-arch=rv64gc --with-tune= --with-isa-spec=20191213 'CFLAGS_FOR_TARGET=-O2
   -mcmodel=medany' 'CXXFLAGS_FOR_TARGET=-O2-mcmodel=medany'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230809 (experimental) (g5c27c911f6b)

[PATCH 3/3] testsuite: Use distinct explicit error codes in value_9.f90

2023-08-09 Thread Mikael Morin via Gcc-patches
Use distinct error codes, so that we can spot directly from the
testsuite log which case is failing.

gcc/testsuite/ChangeLog:

* gfortran.dg/value_9.f90 (val, val4, sub, sub4): Take the error
codes from the arguments.
(p): Update calls: pass explicit distinct error codes.
---
 gcc/testsuite/gfortran.dg/value_9.f90 | 108 +-
 1 file changed, 56 insertions(+), 52 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/value_9.f90 
b/gcc/testsuite/gfortran.dg/value_9.f90
index 1a2fa80ed0d..4813250ebaa 100644
--- a/gcc/testsuite/gfortran.dg/value_9.f90
+++ b/gcc/testsuite/gfortran.dg/value_9.f90
@@ -20,78 +20,82 @@ program p
   ! Check len=1 actual argument cases first
   ca  =   "a"; cp  =   "b"; cd  =   "c"
   ca4 = 4_"d"; cp4 = 4_"e"; cd4 = 4_"f"
-  call val  ("B","B")
-  call val  ("A",char(65))
-  call val  ("A",char(a))
-  call val  ("A",mychar(65))
-  call val  ("A",mychar(a))
-  call val  ("1",c)
-  call val  ("1",(c))
-  call val4 (4_"C",4_"C")
-  call val4 (4_"A",char(65,kind=4))
-  call val4 (4_"A",char(a, kind=4))
-  call val4 (4_"4",c4)
-  call val4 (4_"4",(c4))
-  call val  (ca,ca)
-  call val  (cp,cp)
-  call val  (cd,cd)
-  call val  (ca,(ca))
-  call val4 (ca4,ca4)
-  call val4 (cp4,cp4)
-  call val4 (cd4,cd4)
-  call val4 (cd4,(cd4))
-  call sub  ("S")
-  call sub4 (4_"T")
+  call val  ("B","B", 1, 2)
+  call val  ("A",char(65), 3, 4)
+  call val  ("A",char(a), 5, 6)
+  call val  ("A",mychar(65), 7, 8)
+  call val  ("A",mychar(a), 9, 10)
+  call val  ("1",c, 11, 12)
+  call val  ("1",(c), 13, 14)
+  call val4 (4_"C",4_"C", 15, 16)
+  call val4 (4_"A",char(65,kind=4), 17, 18)
+  call val4 (4_"A",char(a, kind=4), 19, 20)
+  call val4 (4_"4",c4, 21, 22)
+  call val4 (4_"4",(c4), 23, 24)
+  call val  (ca,ca, 25, 26)
+  call val  (cp,cp, 27, 28)
+  call val  (cd,cd, 29, 30)
+  call val  (ca,(ca), 31, 32)
+  call val4 (ca4,ca4, 33, 34)
+  call val4 (cp4,cp4, 35, 36)
+  call val4 (cd4,cd4, 37, 38)
+  call val4 (cd4,(cd4), 39, 40)
+  call sub  ("S", 41, 42)
+  call sub4 (4_"T", 43, 44)
 
   ! Check that always the first character of the string is finally used
-  call val  (  "U++",  "U--")
-  call val4 (4_"V**",4_"V//")
-  call sub  (  "WTY")
-  call sub4 (4_"ZXV")
-  call val  (  "234",  d)
-  call val4 (4_"345",  d4   )
-  call val  (  "234", (d)   )
-  call val4 (4_"345", (d4)  )
-  call val  (  "234",  d (1:2))
-  call val4 (4_"345",  d4(1:2))
-  call val  (  "234",  d (1:l))
-  call val4 (4_"345",  d4(1:l))
-  call val  ("1",c // d)
-  call val  ("1",trim (c // d))
-  call val4 (4_"4",c4 // d4)
-  call val4 (4_"4",trim (c4 // d4))
+  call val  (  "U++",  "U--", 45, 46)
+  call val4 (4_"V**",4_"V//", 47, 48)
+  call sub  (  "WTY", 49, 50)
+  call sub4 (4_"ZXV", 51, 52)
+  call val  (  "234",  d, 53, 54)
+  call val4 (4_"345",  d4   , 55, 56)
+  call val  (  "234", (d)   , 57, 58)
+  call val4 (4_"345", (d4)  , 59, 60)
+  call val  (  "234",  d (1:2), 61, 62)
+  call val4 (4_"345",  d4(1:2), 63, 64)
+  call val  (  "234",  d (1:l), 65, 66)
+  call val4 (4_"345",  d4(1:l), 67, 68)
+  call val  ("1",c // d, 69, 70)
+  call val  ("1",trim (c // d), 71, 72)
+  call val4 (4_"4",c4 // d4, 73, 74)
+  call val4 (4_"4",trim (c4 // d4), 75, 76)
   cd = "gkl"; cd4 = 4_"hmn"
-  call val  (cd,cd)
-  call val4 (cd4,cd4)
-  call sub  (cd)
-  call sub4 (cd4)
+  call val  (cd,cd, 77, 78)
+  call val4 (cd4,cd4, 79, 80)
+  call sub  (cd, 81, 82)
+  call sub4 (cd4, 83, 84)
   deallocate (ca, cp, ca4, cp4, cd, cd4)
 contains
-  subroutine val (x, c)
+  subroutine val (x, c, err1, err2)
 character(kind=1), intent(in) :: x  ! control: pass by reference
 character(kind=1), value  :: c
+integer, intent(in) :: err1, err2
 print *, "by value(kind=1): ", c
-if (c /= x)   stop 1
+if (c /= x)   stop err1
 c = "*"
-if (c /= "*") stop 2
+if (c /= "*") stop err2
   end
 
-  subroutine val4 (x, c)
+  subroutine val4 (x, c, err1, err2)
 character(kind=4), intent(in) :: x  ! control: pass by reference
 character(kind=4), value  :: c
+integer, intent(in) :: err1, err2
 print *, "by value(kind=4): ", c
-if (c /= x) stop 3
+if (c /= x) stop err1
 c = 4_"#"
-if (c /= 4_"#") stop 4
+if (c /= 4_"#") stop err2
   end
 
-  subroutine sub (s)
+  subroutine sub (s, err1, err2)
 character(*), intent(in) :: s
-call val (s, s)
+integer,  intent(in) :: err1, err2
+call val (s, s, err1, err2)
   end
-  subroutine sub4 (s)
+  subroutine sub4 (s, err1, err2)
 character(kind=4,len=*), intent(in) :: s
-call val4 (s, s)
+integer, intent(in) :: err1, err2
+call val4 (s, s, err1, err2)
   end
 
   character function mychar (i)
-- 
2.40.1



[PATCH 1/3] fortran: New predicate gfc_length_one_character_type_p

2023-08-09 Thread Mikael Morin via Gcc-patches
Introduce a new predicate to simplify conditionals checking for
a character type whose length is the constant one.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_length_one_character_type_p): New inline
function.
* check.cc (is_c_interoperable): Use
gfc_length_one_character_type_p.
* decl.cc (verify_bind_c_sym): Same.
* trans-expr.cc (gfc_conv_procedure_call): Same.
---
 gcc/fortran/check.cc  |  7 +++
 gcc/fortran/decl.cc   |  4 +---
 gcc/fortran/gfortran.h| 15 +++
 gcc/fortran/trans-expr.cc |  8 ++--
 4 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/gcc/fortran/check.cc b/gcc/fortran/check.cc
index 4086dc71d34..6c45e6542f0 100644
--- a/gcc/fortran/check.cc
+++ b/gcc/fortran/check.cc
@@ -5250,10 +5250,9 @@ is_c_interoperable (gfc_expr *expr, const char **msg, 
bool c_loc, bool c_f_ptr)
&& !gfc_simplify_expr (expr->ts.u.cl->length, 0))
   gfc_internal_error ("is_c_interoperable(): gfc_simplify_expr failed");
 
-if (!c_loc && expr->ts.u.cl
-   && (!expr->ts.u.cl->length
-   || expr->ts.u.cl->length->expr_type != EXPR_CONSTANT
-   || mpz_cmp_si (expr->ts.u.cl->length->value.integer, 1) != 0))
+if (!c_loc
+   && expr->ts.u.cl
+   && !gfc_length_one_character_type_p (>ts))
   {
*msg = "Type shall have a character length of 1";
return false;
diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 844345df77e..8182ef29f43 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -6064,9 +6064,7 @@ verify_bind_c_sym (gfc_symbol *tmp_sym, gfc_typespec *ts,
 
   /* BIND(C) functions cannot return a character string.  */
   if (bind_c_function && tmp_sym->ts.type == BT_CHARACTER)
-   if (tmp_sym->ts.u.cl == NULL || tmp_sym->ts.u.cl->length == NULL
-   || tmp_sym->ts.u.cl->length->expr_type != EXPR_CONSTANT
-   || mpz_cmp_si (tmp_sym->ts.u.cl->length->value.integer, 1) != 0)
+   if (!gfc_length_one_character_type_p (_sym->ts))
  gfc_error ("Return type of BIND(C) function %qs of character "
 "type at %L must have length 1", tmp_sym->name,
 &(tmp_sym->declared_at));
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 6482a885211..d44e5286626 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3181,6 +3181,21 @@ gfc_finalizer;
 
 / Function prototypes */
 
+
+/* Returns true if the type specified in TS is a character type whose length
+   is the constant one.  Otherwise returns false.  */
+
+inline bool
+gfc_length_one_character_type_p (gfc_typespec *ts)
+{
+  return ts->type == BT_CHARACTER
+&& ts->u.cl
+&& ts->u.cl->length
+&& ts->u.cl->length->expr_type == EXPR_CONSTANT
+&& ts->u.cl->length->ts.type == BT_INTEGER
+&& mpz_cmp_ui (ts->u.cl->length->value.integer, 1) == 0;
+}
+
 /* decl.cc */
 bool gfc_in_match_data (void);
 match gfc_match_char_spec (gfc_typespec *);
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index ef3e6d08f78..6da3975f77c 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6453,12 +6453,8 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   dummy arguments are actually passed by value.
   Strings are truncated to length 1.
   The BIND(C) case is handled elsewhere.  */
-   if (fsym->ts.type == BT_CHARACTER
-   && !fsym->ts.is_c_interop
-   && fsym->ts.u.cl->length->expr_type == EXPR_CONSTANT
-   && fsym->ts.u.cl->length->ts.type == BT_INTEGER
-   && (mpz_cmp_ui
-   (fsym->ts.u.cl->length->value.integer, 1) == 0))
+   if (!fsym->ts.is_c_interop
+   && gfc_length_one_character_type_p (>ts))
  {
if (e->expr_type != EXPR_CONSTANT)
  {
-- 
2.40.1



[PATCH 2/3] fortran: Fix length one character dummy arg type [PR110419]

2023-08-09 Thread Mikael Morin via Gcc-patches
Revision r14-2171-g8736d6b14a4dfdfb58c80ccd398981b0fb5d00aa
changed the argument passing convention for length 1 value dummy
arguments to pass just the single character by value.  However, the
procedure declarations weren't updated to reflect the change in the
argument types.
This change does the missing argument type update.

The change of argument types generated an internal error in
gfc_conv_string_parameter with value_9.f90.  Indeed, that function is
not prepared for bare character type, so it is updated as well.

The condition guarding the single character argument passing code
is loosened to not exclude non-interoperable kind (this fixes
a regression with c_char_tests_2.f03).

Finally, the constant string argument passing code is updated as well
to extract the single char and pass it instead of passing it as
a length one string.  As the code taking care of non-constant arguments
was already doing this, the condition guarding it is just removed.

With these changes, value_9.f90 passes on 32 bits big-endian powerpc.

PR fortran/110360
PR fortran/110419

gcc/fortran/ChangeLog:

* trans-types.cc (gfc_sym_type): Use a bare character type for length
one value character dummy arguments.
* trans-expr.cc (gfc_conv_string_parameter): Handle single character
case.
(gfc_conv_procedure_call): Don't exclude interoperable kinds
from single character handling.  For single character dummy arguments,
extend the existing handling of non-constant expressions to constant
expressions.

gcc/testsuite/ChangeLog:

* gfortran.dg/bind_c_usage_13.f03: Update tree dump patterns.
---
 gcc/fortran/trans-expr.cc | 35 +++
 gcc/fortran/trans-types.cc|  5 ++-
 gcc/testsuite/gfortran.dg/bind_c_usage_13.f03 |  8 ++---
 3 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 6da3975f77c..d91cc9da221 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6451,26 +6451,24 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 
/* ABI: actual arguments to CHARACTER(len=1),VALUE
   dummy arguments are actually passed by value.
-  Strings are truncated to length 1.
-  The BIND(C) case is handled elsewhere.  */
-   if (!fsym->ts.is_c_interop
-   && gfc_length_one_character_type_p (>ts))
+  Strings are truncated to length 1.  */
+   if (gfc_length_one_character_type_p (>ts))
  {
-   if (e->expr_type != EXPR_CONSTANT)
- {
-   tree slen1 = build_int_cst (gfc_charlen_type_node, 
1);
-   gfc_conv_string_parameter ();
-   parmse.expr = gfc_string_to_single_character (slen1,
- 
parmse.expr,
- 
e->ts.kind);
-   /* Truncate resulting string to length 1.  */
-   parmse.string_length = slen1;
- }
-   else if (e->value.character.length > 1)
+   if (e->expr_type == EXPR_CONSTANT
+   && e->value.character.length > 1)
  {
e->value.character.length = 1;
gfc_conv_expr (, e);
  }
+
+   tree slen1 = build_int_cst (gfc_charlen_type_node, 1);
+   gfc_conv_string_parameter ();
+   parmse.expr
+   = gfc_string_to_single_character (slen1,
+ parmse.expr,
+ e->ts.kind);
+   /* Truncate resulting string to length 1.  */
+   parmse.string_length = slen1;
  }
 
if (fsym->attr.optional
@@ -10610,6 +10608,13 @@ gfc_conv_string_parameter (gfc_se * se)
 {
   tree type;
 
+  if (TREE_CODE (TREE_TYPE (se->expr)) == INTEGER_TYPE
+  && integer_onep (se->string_length))
+{
+  se->expr = gfc_build_addr_expr (NULL_TREE, se->expr);
+  return;
+}
+
   if (TREE_CODE (se->expr) == STRING_CST)
 {
   type = TREE_TYPE (TREE_TYPE (se->expr));
diff --git a/gcc/fortran/trans-types.cc b/gcc/fortran/trans-types.cc
index 987e3d26c46..084b8c3ae2c 100644
--- a/gcc/fortran/trans-types.cc
+++ b/gcc/fortran/trans-types.cc
@@ -2313,7 +2313,10 @@ gfc_sym_type (gfc_symbol * sym, bool is_bind_c)
  && sym->ns->proc_name
  && sym->ns->proc_name->attr.is_bind_c)
  || 

[PATCH 0/3] fortran: fix length one character dummy args [PR110419]

2023-08-09 Thread Mikael Morin via Gcc-patches
Hello,

I propose with this patchset a fix for the test value_9.f90 which has been
failing on 32 bits powerpc since it was added a few weeks back (see PR110360
and PR110419).

The problem is an argument type mismatch between a procedure declaration,
and the argument value for a call of that same procedure, in the specific
case of length one character dummy arguments with the value attribute.
Admittedly, our argument passing conventions [1] for those are currently
unspecified.

Before PR110360, character dummy arguments with value attribute were
arrays passed by value, but the actual argument was still passed as
reference.  PR110360 changed that to pass length one dummies as bare
character (i.e. scalar integer), like in the bind(c) case (but with length
argument still present).  However, the argument type in the function declaration
wasn't changed at the same time, so the test was failing on big-endian 32 bits
targets.  Surprisingly, on most targets the middle-end, back-end and runtime
are happy to get a scalar value passed where a length one array is expected.

This can be fixed, either by reverting back to arguments represented as
arrays passed by value with calls fixed, or by keeping the new
representation with single characters for arguments and fixing the procedure
types accordingly.

I haven't really tried the first way, this is using the second one.
The first patch is a preliminary refactoring.  The main change is the
second patch.  It modifies the types of length one character dummy argsuments
with value attribute in the procedure declarations, so that they are scalar
integer types, consistently with how arguments are passed for calls.
The third patch is a change of error codes in the testcase.

I have regression tested this on x86_64-unknown-linux-gnu, and
powerpc64-unknown-linux-gnu (both -m32 and -m64).
OK for master?

[1] https://gcc.gnu.org/onlinedocs/gfortran/Argument-passing-conventions.html


Mikael Morin (3):
  fortran: New predicate gfc_length_one_character_type_p
  fortran: Fix length one character dummy arg type [PR110419]
  testsuite: Use distinct explicit error codes in value_9.f90

 gcc/fortran/check.cc  |   7 +-
 gcc/fortran/decl.cc   |   4 +-
 gcc/fortran/gfortran.h|  15 +++
 gcc/fortran/trans-expr.cc |  39 ---
 gcc/fortran/trans-types.cc|   5 +-
 gcc/testsuite/gfortran.dg/bind_c_usage_13.f03 |   8 +-
 gcc/testsuite/gfortran.dg/value_9.f90 | 108 +-
 7 files changed, 103 insertions(+), 83 deletions(-)

-- 
2.40.1



[pushed] analyzer: remove default return value from region_model::on_call_pre

2023-08-09 Thread David Malcolm via Gcc-patches
Previously, the code for simulating calls to external functions in
region_model::on_call_pre wrote a default svalue to the LHS of the
call statement, which could be further overwritten by known_function
subclasses.

Unfortunately, this led to messy hacks, such as when the default svalue
was an allocation: the LHS would be written to with two different
heap-allocated regions, requiring special-case cleanups to avoid the
stray state from the first heap allocation leading to state explosions;
see r14-3001-g021077b94741c9.

The following patch eliminates this write of a default svalue to the LHS
of callsite.  Instead, all known_function implementations that have a
return value are now responsible for set the LHS themselves.  A new
call_details::set_any_lhs_with_defaults function is provided to make it
easy to get the old behavior.

On working through the various known_function subclasses, I noticed that
memset was using the default behavior.  That patch updates this so that
it's now known to return its first parameter.

Cleaning this up eliminates various doubling of saved_diagnostics (e.g.
for dubious_allocation_size) where it was generating a diagnostic for
both writes to the LHS, deduplicating them to the first diagnostic (with
the default LHS), and then failing to create a region_creation_event
when emitting the diagnostic, leading to the fallback wording in
dubious_allocation_size::describe_final_event, such as:

  (1) allocated 42 bytes and assigned to ‘int32_t *’ {aka ‘int *’} here; 
‘sizeof (int32_t {aka int})’ is ‘4’

Without the double write to the LHS, it creates a region_creation_event,
so we get the allocation and the assignment as two separate events in
the diagnostic path, e.g.:

  (1) allocated 42 bytes here
  (2) assigned to ‘int32_t *’ {aka ‘int *’} here; ‘sizeof (int32_t {aka int})’ 
is ‘4’

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3114-g73da34a538ddc2.

gcc/analyzer/ChangeLog:
* analyzer.h (class pure_known_function_with_default_return): New
subclass.
* call-details.cc (const_fn_p): Move here from region-model.cc.
(maybe_get_const_fn_result): Likewise.
(get_result_size_in_bytes): Likewise.
(call_details::set_any_lhs_with_defaults): New function, based on
code in region_model::on_call_pre.
* call-details.h (call_details::set_any_lhs_with_defaults): New
decl.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Log the index of the
saved_diagnostic.
* kf.cc (pure_known_function_with_default_return::impl_call_pre):
New.
(kf_memset::impl_call_pre): Set the LHS to the first param.
(kf_putenv::impl_call_pre): Call cd.set_any_lhs_with_defaults.
(kf_sprintf::impl_call_pre): Call cd.set_any_lhs_with_defaults.
(class kf_stack_restore): Derive from
pure_known_function_with_default_return.
(class kf_stack_save): Likewise.
(kf_strlen::impl_call_pre): Call cd.set_any_lhs_with_defaults.
* region-model-reachability.cc (reachable_regions::handle_sval):
Remove logic for symbolic regions for pointers.
* region-model.cc (region_model::canonicalize): Remove purging of
dynamic extents workaround for surplus values from
region_model::on_call_pre's default LHS code.
(const_fn_p): Move to call-details.cc.
(maybe_get_const_fn_result): Likewise.
(get_result_size_in_bytes): Likewise.
(region_model::update_for_nonzero_return): Call
cd.set_any_lhs_with_defaults.
(region_model::on_call_pre): Remove the assignment to the LHS of a
default return value, instead requiring all known_function
implementations to write to any LHS of the call.  Use
cd.set_any_lhs_with_defaults on the non-kf paths.
* sm-fd.cc (kf_socket::outcome_of_socket::update_model): Use
cd.set_any_lhs_with_defaults when failing to get at fd state.
(kf_bind::outcome_of_bind::update_model): Likewise.
(kf_listen::outcome_of_listen::update_model): Likewise.
(kf_accept::outcome_of_accept::update_model): Likewise.
(kf_connect::outcome_of_connect::update_model): Likewise.
(kf_read::impl_call_pre): Use cd.set_any_lhs_with_defaults.
* sm-file.cc (class kf_stdio_output_fn): Derive from
pure_known_function_with_default_return.
(class kf_ferror): Likewise.
(class kf_fileno): Likewise.
(kf_fgets::impl_call_pre): Use cd.set_any_lhs_with_defaults.
(kf_read::impl_call_pre): Likewise.
(class kf_getc): Derive from
pure_known_function_with_default_return.
(class kf_getchar): Likewise.
* varargs.cc (kf_va_arg::impl_call_pre): Use
cd.set_any_lhs_with_defaults.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/allocation-size-1.c: Update expected results
to reflect splitting of allocation size 

Re: [PATCH 3/12] phiopt: Fix phiopt ICE on vops [PR102989]

2023-08-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Aug 09, 2023 at 01:06:22PM -0700, Andrew Pinski wrote:
> That was my next thought, returning NULL from
> single_non_singleton_phi_for_edges if it would return a virtual OP
> might be better even.
> Either version of these patches are ok with me (though I am not the
> maintainer here).

In patch form that would be (but so far untested):

2023-08-09  

PR c/102989
* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges): Never
return virtual phis and return NULL if there is a virtual phi
where the arguments from E0 and E1 edges aren't equal.

--- gcc/tree-ssa-phiopt.cc.jj   2023-08-09 22:08:07.974563266 +0200
+++ gcc/tree-ssa-phiopt.cc  2023-08-09 22:11:37.291517911 +0200
@@ -63,7 +63,13 @@ single_non_singleton_phi_for_edges (gimp
   gimple_stmt_iterator i;
   gphi *phi = NULL;
   if (gimple_seq_singleton_p (seq))
-return as_a  (gsi_stmt (gsi_start (seq)));
+{
+  phi = as_a  (gsi_stmt (gsi_start (seq)));
+  /* Never return virtual phis.  */
+  if (virtual_operand_p (gimple_phi_result (phi)))
+   return NULL;
+  return phi;
+}
   for (i = gsi_start (seq); !gsi_end_p (i); gsi_next ())
 {
   gphi *p = as_a  (gsi_stmt (i));
@@ -72,6 +78,10 @@ single_non_singleton_phi_for_edges (gimp
   gimple_phi_arg_def (p, e1->dest_idx)))
continue;
 
+  /* Punt on virtual phis with different arguments from the edges.  */
+  if (virtual_operand_p (gimple_phi_result (p)))
+   return NULL;
+
   /* If we already have a PHI that has the two edge arguments are
 different, then return it is not a singleton for these PHIs. */
   if (phi)


Jakub



Re: [RFC] GCC Security policy

2023-08-09 Thread Siddhesh Poyarekar

On 2023-08-09 14:17, David Edelsohn wrote:
On Wed, Aug 9, 2023 at 1:33 PM Siddhesh Poyarekar > wrote:


On 2023-08-08 10:30, Siddhesh Poyarekar wrote:
 >> Do you have a suggestion for the language to address libgcc,
 >> libstdc++, etc. and libiberty, libbacktrace, etc.?
 >
 > I'll work on this a bit and share a draft.

Hi David,

Here's what I came up with for different parts of GCC, including the
runtime libraries.  Over time we may find that specific parts of
runtime
libraries simply cannot be used safely in some contexts and flag that.

Sid


Hi, Sid

Thanks for iterating on this.


"""
What is a GCC security bug?
===

      A security bug is one that threatens the security of a system or
      network, or might compromise the security of data stored on it.
      In the context of GCC there are multiple ways in which this might
      happen and they're detailed below.

Compiler drivers, programs, libgccjit and support libraries
---

      The compiler driver processes source code, invokes other programs
      such as the assembler and linker and generates the output result,
      which may be assembly code or machine code.  It is necessary that
      all source code inputs to the compiler are trusted, since it is
      impossible for the driver to validate input source code beyond
      conformance to a programming language standard.

      The GCC JIT implementation, libgccjit, is intended to be plugged
      into applications to translate input source code in the
application
      context.  Limitations that apply to the compiler
      driver, apply here too in terms of sanitizing inputs, so it is
      recommended that inputs are either sanitized by an external
program
      to allow only trusted, safe execution in the context of the
      application or the JIT execution context is appropriately
sandboxed
      to contain the effects of any bugs in the JIT or its generated
code
      to the sandboxed environment.

      Support libraries such as libiberty, libcc1 libvtv and libcpp have
      been developed separately to share code with other tools such as
      binutils and gdb.  These libraries again have similar
challenges to
      compiler drivers.  While they are expected to be robust against
      arbitrary input, they should only be used with trusted inputs.

      Libraries such as zlib and libffi that bundled into GCC to
build it
      will be treated the same as the compiler drivers and programs
as far
      as security coverage is concerned.


Should we direct people to the upstream projects for their security 
policies?


We bundle zlib and libffi so regardless of whether it's a security issue 
in those libraries (because security impact of memory safety bugs in 
general use libraries will be context dependent and hence get assigned 
CVEs more often than not), the context in gcc is well defined as a local 
unprivileged executable and hence not security-relevant.


That said, we could add something like:

However if you find a issue in these libraries independent of their
use in GCC you should reach out to their upstream projects to report
them.




      As a result, the only case for a potential security issue in all
      these cases is when it ends up generating vulnerable output for
      valid input source code.


Language runtime libraries
--

      GCC also builds and distributes libraries that are intended to be
      used widely to implement runtime support for various programming
      languages.  These include the following:

      * libada
      * libatomic
      * libbacktrace
      * libcc1
      * libcody
      * libcpp
      * libdecnumber
      * libgcc
      * libgfortran
      * libgm2
      * libgo
      * libgomp
      * libiberty
      * libitm
      * libobjc
      * libphobos
      * libquadmath
      * libssp
      * libstdc++

      These libraries are intended to be used in arbitrary contexts
and as
      a result, bugs in these libraries may be evaluated for security
      impact.  However, some of these libraries, e.g. libgo, libphobos,
      etc.  are not maintained in the GCC project, due to which the GCC
      project may not be the correct point of contact for them.  You are
      encouraged to look at README files within those library
directories
      to locate the canonical security contact point for those projects.


As Richard mentioned, should GCC make a specific statement about the 
security policy / response for issues that are discovered and fixed in 
the upstream 

Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-09 Thread Qing Zhao via Gcc-patches


> On Aug 9, 2023, at 12:21 PM, Michael Matz  wrote:
> 
> Hello,
> 
> On Wed, 9 Aug 2023, Qing Zhao wrote:
> 
>> Although this is an old FAM related issue that does not relate to my current 
>> patch 
>> (and might need to be resolved in a separate patch).  I think that it’s 
>> necessary to have
>> more discussion on this old issue and resolve it. 
>> 
>> The first thing that I’d like to confirm is:
>> 
>> What the exact memory layout for the following structure x?
>> 
>> struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
>> 
>> And the key that is confusing me is, where should the field “t” start? 
>> 
>> A.  Starting at offset 8 as the following:
>> 
>> a4-bytes
>> b2-bytes
>> padding   2-bytes
>> t3-bytes
> 
> Why should there be padding before 't'?  It's a char array (FAM or not), 
> so it can be (and should be) placed directly after 'b'.  So ...

Yes, you are right. My mistake -:)
> 
>> B. Starting at offset 6 as the following:
>> 
>> a4-bytes
>> b2-bytes
>> t3-bytes
> 
> ... this is the correct layout, when seen in isolation.  The discussion 
> revolves around what should come after 't': if it's a non-FAM struct (with 
> t[3]), then it's clear that there needs to be padding after it, so to pad 
> out the whole struct to be 12 bytes long (for sizeof() purpose), as 
> required by its alignment (due to the int field 'a').

right.
> 
> So, should the equivalent FAM struct also have this sizeof()?  If no: 
> there should be a good argument why it shouldn't be similar to the non-FAM 
> one.

The sizeof() of a structure with FAM is defined as: (after I searched online,
 I think that the one in Wikipedia is the most reasonable one):
https://en.wikipedia.org/wiki/Flexible_array_member

==
Effect on struct size and padding
The sizeof operator on such a struct gives the size of the structure as if the 
flexible 
array member were empty. This may include padding added to accommodate the
 flexible member; the compiler is also free to re-use such padding as part of 
the array itself.[2]

It is common to allocate sizeof(struct) + array_len*sizeof(array element) 
bytes. 

This is not wrong, but it may allocate a few more bytes than necessary: the 
compiler
 may be re-purposing some of the padding that is included in sizeof(struct). 
Should 
this be a concern, macros are available[3] to compute the minimum size while 
ensuring that the compiler's padding is not disrupted. 

As the array may start in the padding before the end of the structure, its 
content
 should always be accessed via indexing (arr[i]) or  offsetof, not sizeof.
==

By definition, the sizeof() of a struct with FAM might not be the same as the 
non-FAM one. 
i.e, for the following two structures, one with FAM, the other with fixed array:

struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
struct foo_fix {int a; short b; char t[3]; } 

With current GCC:
sizeof(foo_flex) == 8
sizeof(foo_fix) == 12

I think that the current behavior of sizeof for structure with FAM in GCC is 
correct. 

The major issue is what was pointed out by Martin in the previous email:

Whether using the following formula is correct to compute the allocation?

sizeof(struct foo_flex) + N * sizeof(foo->t);

As pointed out  in the wikipedia, the value computed by this formula might
 be bigger than the actual size since “sizeof(struct foo_flex)” might include 
paddings that are used as part of the array.

So the more accurate formula should be

offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 


> Then there is an argument that the compiler would be fine, when allocating 
> a single object of such type (not as part of an array!), to only reserve 9 
> bytes of space for the FAM-struct.  Then the question is: should it also 
> do that for a non-FAM struct (based on the argument that the padding 
> behind 't' isn't accessible, and hence doesn't need to be alloced).  I 
> think it would be quite unexpected for the compiler to actually reserve 
> less space than sizeof() implies, so I personally don't buy that argument.  
> For FAM or non-FAM structs.

For the question, whether the compiler needs to allocate paddings after the FAM 
field,
 I don’t know the answer, and it’s not specified in the standard either. 
Does it matter?

> 
> Note that if one choses to allocate less space than sizeof implies that 
> this will have quite some consequences for code generation, in that 
> sometimes the instruction sequences (e.g. for copying) need to be careful 
> to never access tail padding that should be there in array context, but 
> isn't there in single-object context.  I think this alone should make it 
> clear that it's advisable that sizeof() and allocated size agree.

Sizeof by definition is return the size of the TYPE, not the size of the 
allocated object.

Another thing I need to point out is,  a 

Re: [PATCH 3/12] phiopt: Fix phiopt ICE on vops [PR102989]

2023-08-09 Thread Andrew Pinski via Gcc-patches
On Wed, Aug 9, 2023 at 1:01 PM Jakub Jelinek  wrote:
>
> On Wed, Aug 09, 2023 at 11:27:48AM -0700, Andrew Pinski wrote:
> > Maybe it is better to punt for VOPS after the call to
> > single_non_singleton_phi_for_edges since none of functions called
> > afterwards support VOPs.
> > That is something like:
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index ff36bb0119b..d0b659042a7 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -4165,6 +4165,10 @@ pass_phiopt::execute (function *)
> >arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> >arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> >
> > +  /* Can't do anything with a VOP here.  */
> > +  if (SSA_NAME_IS_VIRTUAL_OPERAND (arg0))
> > +   continue;
> > +
>
> That would ICE if arg0 isn't SSA_NAME (e.g. is INTEGER_CST).
> I think more canonical test for virtual phis is
> if (virtual_operand_p (gimple_phi_result (phi)))
>
> Shall already single_non_singleton_phi_for_edges punt if there is
> a virtual phi with different arguments from the edges (or if there
> is a single virtual phi)?

That was my next thought, returning NULL from
single_non_singleton_phi_for_edges if it would return a virtual OP
might be better even.
Either version of these patches are ok with me (though I am not the
maintainer here).

Thanks,
Andrew

>
> Jakub
>


Re: [RFC PATCH 0/2] RISC-V: __builtin_riscv_pause for all environment

2023-08-09 Thread Jeff Law via Gcc-patches




On 8/9/23 00:11, Tsukasa OI via Gcc-patches wrote:

Hello,

I found that a built-in function "__builtin_riscv_pause" is not usable
unless we enable the 'Zihintpause' extension explicitly (still, this
built-in exists EVEN IF the 'Zihintpause' extension is disabled).

Contents of a.c:


void sample(void)
{
 __builtin_riscv_pause();
}



Compiling with the 'Zihintpause' extension is fine.


$ riscv64-unknown-elf-gcc -O2 -march=rv64i_zihintpause -mabi=lp64 -c a.c



However, compiling without the 'Zihintpause' causes an assembler error
(tested with GNU Binutils 2.41):


$ riscv64-unknown-elf-gcc -O2 -march=rv64i -mabi=lp64 -c a.c
/tmp/ccFjacAz.s: Assembler messages:
/tmp/ccFjacAz.s:11: Error: unrecognized opcode `pause', extension `zihintpause' 
required



This is because:

1.  GCC does not handle the 'Zihintpause' extension and
2.  "riscv_pause" (insn) unconditionally emits "pause" even if the
 assembler does not accept it (when the extension is disabled).


This patch set (PATCH 1/2) resolves this issue by:

1.  Handling the 'Zihintpause' extension and
2.  Splitting the "__builtin_riscv_pause" implementation
 depending on the existence of the 'Zihintpause' extension.

Because a released version of GCC defines "__builtin_riscv_pause"
unconditionally, I chose to define no-'Zihintpause' version.

There is another option to unconditionally emit ".insn 0x010f"
(the machine code of "pause") but I didn't because I wanted to improve the
diagnostics (e.g. *.s output).

I also fixed the description of this built-in function (in PATCH 2/2).


I'm not sure whether this is a good method to split the implementation
depending on the 'Zihintpause' extension.  Other than that, I believe that
this is okay and approval is appreciated.

Note that because I assigned copyright of GCC contribution to FSF, I didn't
attach "Signed-off-by" tag.  Tell me if I need it.
I'd tend to think we do not want to expose the intrinsic unless the 
right extensions are enabled -- even though the encoding is a no-op and 
we could emit it as a .insn.


If others think otherwise, I'll go with the consensus opinion.  So let's 
hold off a bit until others have chimed in.


Thanks,
jeff


Re: [PATCH 3/12] phiopt: Fix phiopt ICE on vops [PR102989]

2023-08-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Aug 09, 2023 at 11:27:48AM -0700, Andrew Pinski wrote:
> Maybe it is better to punt for VOPS after the call to
> single_non_singleton_phi_for_edges since none of functions called
> afterwards support VOPs.
> That is something like:
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index ff36bb0119b..d0b659042a7 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -4165,6 +4165,10 @@ pass_phiopt::execute (function *)
>arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
>arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> 
> +  /* Can't do anything with a VOP here.  */
> +  if (SSA_NAME_IS_VIRTUAL_OPERAND (arg0))
> +   continue;
> +

That would ICE if arg0 isn't SSA_NAME (e.g. is INTEGER_CST).
I think more canonical test for virtual phis is
if (virtual_operand_p (gimple_phi_result (phi)))

Shall already single_non_singleton_phi_for_edges punt if there is
a virtual phi with different arguments from the edges (or if there
is a single virtual phi)?

Jakub



Re: [PATCH] RISC-V: Remove non-existing 'Zve32d' extension

2023-08-09 Thread Jeff Law via Gcc-patches




On 8/9/23 00:09, Tsukasa OI via Gcc-patches wrote:

Since this extension does not exist, this commit prunes this from
the defined extension version table.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc(riscv_ext_version_table):
Remove 'Zve32d' from the version list.

Thanks.  Installed.
jeff


Re: [PATCH] RISC-V: Handle no_insn in TARGET_SCHED_VARIABLE_ISSUE.

2023-08-09 Thread Jeff Law via Gcc-patches



On 5/29/23 06:46, Jeff Law wrote:



On 5/29/23 05:01, Jin Ma wrote:
Reference: 
https://github.com/gcc-mirror/gcc/commit/d0bc0cb66bcb0e6a5a5a31a9e900e8ccc98e34e5


RISC-V should also be implemented to handle no_insn patterns for 
pipelining.


gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_sched_variable_issue): New function.
(TARGET_SCHED_VARIABLE_ISSUE): New macro.
---
  gcc/config/riscv/riscv.cc | 21 +
  1 file changed, 21 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3954fc07a8b..559fa9cd7e0 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6225,6 +6225,24 @@ riscv_issue_rate (void)
    return tune_param->issue_rate;
  }
+/* Implement TARGET_SCHED_VARIABLE_ISSUE.  */
+
+static int
+riscv_sched_variable_issue (FILE *, int, rtx_insn *insn, int more)
+{
+  if (DEBUG_INSN_P (insn))
+    return more;
+
+  rtx_code code = GET_CODE (PATTERN (insn));
+  if (code == USE || code == CLOBBER)
+    return more;
+
+  if (get_attr_type (insn) == TYPE_UNKNOWN)
+    return more;
+
+  return more - 1;
+}
The problem is that INSN is *much* more likely to be a real instruction 
that takes real resources, even if it is TYPE_UNKNOWN.
TYPE_UNKNOWN here is actually an indicator of what I would consider a 
bug in the backend, specifically that we have INSNs that do not provide 
a mapping for the schedulers to suitable types.


With that in mind I'd much rather get to the point where we can do 
something like this for TYPE_UNKNOWN:


type = get_attr_type (insn);
gcc_assert (type != TYPE_UNKNOWN);

That way if we ever encounter a TYPE_UNKNOWN during development, we can 
fix it in the md files in a sensible manner.  I don't know if we are 
close to being able to do that.  We fixed a ton of stuff in bitmanip.md, 
but I don't think there's been a thorough review of the port to find 
other instances of TYPE_UNKNOWN INSNs.



The other thing if this code probably wants to handle GHOST type 
instructions.  While GHOST is used for instructions which generate no 
code, it might seem they should return "more" as those INSNs take no 
resources.  But GHOST is actually used for things like the blockage insn 
which should end a cycle from an issue standpoint.  So the right 
handling of a GHOST is something like this:


if (type == TYPE_GHOST)
   return 0;
So there wasn't ever any follow-up.  Given this was something Ventana 
was also carrying locally (with very minor differences) I went ahead and 
merged up the implementations and pushed the final result to the trunk.



Attached is the patch that was actually committed.

Jeffcommit f088b768d01ae42385697584a2bcac141685dce2
Author: Jin Ma 
Date:   Wed Aug 9 13:52:06 2023 -0600

RISC-V: Handle no_insn in TARGET_SCHED_VARIABLE_ISSUE.

Reference: 
https://github.com/gcc-mirror/gcc/commit/d0bc0cb66bcb0e6a5a5a31a9e900e8ccc98e34e5

RISC-V should also be implemented to handle no_insn patterns for pipelining.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_sched_variable_issue): New function.
(TARGET_SCHED_VARIABLE_ISSUE): New macro.

Co-authored-by: Philipp Tomsich 
Co-authored-by: Jeff Law 

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7f2041a54ba..dfb519ab9a8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6698,6 +6698,31 @@ riscv_issue_rate (void)
   return tune_param->issue_rate;
 }
 
+/* Implement TARGET_SCHED_VARIABLE_ISSUE.  */
+static int
+riscv_sched_variable_issue (FILE *, int, rtx_insn *insn, int more)
+{
+  if (DEBUG_INSN_P (insn))
+return more;
+
+  rtx_code code = GET_CODE (PATTERN (insn));
+  if (code == USE || code == CLOBBER)
+return more;
+
+  /* GHOST insns are used for blockage and similar cases which
+ effectively end a cycle.  */
+  if (get_attr_type (insn) == TYPE_GHOST)
+return 0;
+
+#if 0
+  /* If we ever encounter an insn with an unknown type, trip
+ an assert so we can find and fix this problem.  */
+  gcc_assert (get_attr_type (insn) != TYPE_UNKNOWN);
+#endif
+
+  return more - 1;
+}
+
 /* Auxiliary function to emit RISC-V ELF attribute. */
 static void
 riscv_emit_attribute ()
@@ -8420,6 +8445,9 @@ riscv_frame_pointer_required (void)
 #undef TARGET_SCHED_ISSUE_RATE
 #define TARGET_SCHED_ISSUE_RATE riscv_issue_rate
 
+#undef  TARGET_SCHED_VARIABLE_ISSUE
+#define TARGET_SCHED_VARIABLE_ISSUE riscv_sched_variable_issue
+
 #undef TARGET_FUNCTION_OK_FOR_SIBCALL
 #define TARGET_FUNCTION_OK_FOR_SIBCALL riscv_function_ok_for_sibcall
 


[Bug tree-optimization/110963] [14 Regression] Dead Code Elimination Regression since r14-2946-g46c8c225455

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110963

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-09
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.
PRE seems to have missed an obvious redundant load here though:
The IR after PRE:
```
  pretmp_12 = h;
  if (f.2_9 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]
...
   [local count: 536870912]:
  if (pretmp_12 == 0B)
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 375809640]:
  h = 0B;

   [local count: 536870912]:
  g = 0;
  h.6_15 = h;
```

The IR before PRE:
```
   [local count: 1073741824]:
  # VUSE <.MEM_2(D)>
  f.2_9 = f;
  if (f.2_9 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  # VUSE <.MEM_2(D)>
  _11 = h;
  if (_11 != 0B)
goto ; [0.00%]
  else
goto ; [100.00%]
...
   [local count: 536870912]:
  # VUSE <.MEM_2(D)>
  _13 = h;
  if (_13 == 0B)
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 375809640]:
  # .MEM_20 = VDEF <.MEM_2(D)>
  h = 0B;

   [local count: 536870912]:
  # .MEM_16 = PHI <.MEM_2(D)(5), .MEM_20(6)>
  # .MEM_21 = VDEF <.MEM_16>
  g = 0;
  # VUSE <.MEM_21>
  h.6_15 = h;
```

[Bug tree-optimization/110963] [14 Regression] Dead Code Elimination Regression since r14-2946-g46c8c225455

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110963

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 CC||pinskia at gcc dot gnu.org
   Keywords||missed-optimization

[Bug tree-optimization/110963] New: [14 Regression] Dead Code Elimination Regression since r14-2946-g46c8c225455

2023-08-09 Thread theodort at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110963

Bug ID: 110963
   Summary: [14 Regression] Dead Code Elimination Regression since
r14-2946-g46c8c225455
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

https://godbolt.org/z/devdMf1qd

Given the following code:

void foo(void);
static int c = 76, f, g;
static int *h, *j, *k = 
static int **i = 
static short a;
static char(l)(char b) {
if (!(((b) >= 77) && ((b) <= 77))) {
__builtin_unreachable();
}
return 0;
}
static short(m)(short d, short e) { return d + e; }
static short n(char) {
j = *i;
if (j == 0)
;
else
*i = 0;
*k = 0;
return 0;
}
static char o() {
l(0);
return 0;
}
static char p(int ad) {
a = m(!0, ad);
l(a);
if (f) {
*i &(o());
*i = 0;
} else
n(0);
if (h ==  || h == 0)
;
else
foo();
return 0;
}
int main() {
p(c);
c = 8;
}

gcc-trunk -O3 does not eliminate the call to foo:

main:
movlf(%rip), %edi
testl   %edi, %edi
je  .L11
.L7:
movl$8, c(%rip)
xorl%eax, %eax
ret
.L11:
cmpq$0, h(%rip)
je  .L4
xorl%esi, %esi
movq%rsi, h(%rip)
.L4:
movqh(%rip), %rax
xorl%ecx, %ecx
movl%ecx, g(%rip)
testq   %rax, %rax
je  .L7
cmpq$f, %rax
je  .L7
pushq   %rax
callfoo
xorl%eax, %eax
movl$8, c(%rip)
popq%rdx
ret

gcc-13.2.0 -O3 eliminates the call to foo:

main:
movlf(%rip), %ecx
movqh(%rip), %rax
testl   %ecx, %ecx
jne .L3
testq   %rax, %rax
je  .L4
xorl%edx, %edx
movq%rdx, h(%rip)
.L4:
xorl%eax, %eax
movl%eax, g(%rip)
.L3:
movl$8, c(%rip)
xorl%eax, %eax
ret

Bisects to r14-2946-g46c8c225455

[Bug middle-end/110954] [14 Regression] Wrong code with -O0

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110954

--- Comment #3 from Andrew Pinski  ---
Cleaned and simplified up testcase:
```
#define comparison (f < 0)
int main() {
  int f = 0;
  int d = comparison | !comparison;
  if (d != 1)
__builtin_abort();
}

```

Re: RISC-V: Folding memory for FP + constant case

2023-08-09 Thread Jeff Law via Gcc-patches



On 7/12/23 14:59, Jivan Hakobyan via Gcc-patches wrote:

Accessing local arrays element turned into load form (fp + (index << C1)) +
C2 address.
In the case when access is in the loop we got loop invariant computation.
For some reason, moving out that part cannot be done in
loop-invariant passes.
But we can handle that in target-specific hook (legitimize_address).
That provides an opportunity to rewrite memory access more suitable for the
target architecture.

This patch solves the mentioned case by rewriting mentioned case to ((fp +
C2) + (index << C1))
I have evaluated it on SPEC2017 and got an improvement on leela (over 7b
instructions,
.39% of the dynamic count) and dwarfs the regression for gcc (14m
instructions, .0012%
of the dynamic count).


gcc/ChangeLog:
 * config/riscv/riscv.cc (riscv_legitimize_address): Handle folding.
 (mem_shadd_or_shadd_rtx_p): New predicate.

So I poked a bit more in this space today.

As you may have noted, Manolis's patch still needs another rev.  But I 
was able to test this patch in conjunction with the f-m-o patch as well 
as the additional improvements made to hard register cprop.  The net 
result was that this patch still shows a nice decrease in instruction 
counts on leela.  It's a bit of a mixed bag elsewhere.


I dove a bit deeper into the small regression in x264.  In the case I 
looked at the reason the patch regresses is the original form of the 
address calculations exposes a common subexpression ie


addr1 = (reg1 << 2) + fp + C1
addr2 = (reg1 << 2) + fp + C2

(reg1 << 2) + fp is a common subexpression resulting in something like 
this as we leave CSE:


t = (reg1 << 2) + fp;
addr1 = t + C1
addr2 = t + C2
mem (addr1)
mem (addr2)

C1 and C2 are small constants, so combine generates

t = (reg1 << 2) + fp;
mem (t+C1)
mem (t+C2)

FP elimination occurs after IRA and we get:

t2 = sp + C3
t = (reg << 2) + t2
mem (t + C1)
mem (t + C2)


Not bad.  Manolis's work should allow us to improve that a bit more.


With this patch we don't capture the CSE and ultimately generate 
slightly worse code.  This kind of issue is fairly inherent in 
reassociations -- and given the regression is 2 orders of magnitude 
smaller than the improvement my inclination is to go forward with this 
patch.




I've fixed a few formatting issues and changed once conditional to use 
CONST_INT_P rather than checking the code directory and pushed the final 
version to the trunk.


Thanks for your patience.

jeff
commit a16dc729fda9fabd6472d50cce45791cb3b6ada8
Author: Jivan Hakobyan 
Date:   Wed Aug 9 13:26:58 2023 -0600

RISC-V: Folding memory for FP + constant case

Accessing local arrays element turned into load form (fp + (index << C1)) +
C2 address.

In the case when access is in the loop we got loop invariant computation.  
For
some reason, moving out that part cannot be done in loop-invariant passes.  
But
we can handle that in target-specific hook (legitimize_address).  That 
provides
an opportunity to rewrite memory access more suitable for the target
architecture.

This patch solves the mentioned case by rewriting mentioned case to ((fp +
C2) + (index << C1))

I have evaluated it on SPEC2017 and got an improvement on leela (over 7b
instructions, .39% of the dynamic count) and dwarfs the regression for gcc 
(14m
instructions, .0012% of the dynamic count).

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimize_address): Handle folding.
(mem_shadd_or_shadd_rtx_p): New function.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 77892da2920..7f2041a54ba 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1805,6 +1805,22 @@ riscv_shorten_lw_offset (rtx base, HOST_WIDE_INT offset)
   return addr;
 }
 
+/* Helper for riscv_legitimize_address. Given X, return true if it
+   is a left shift by 1, 2 or 3 positions or a multiply by 2, 4 or 8.
+
+   This respectively represent canonical shift-add rtxs or scaled
+   memory addresses.  */
+static bool
+mem_shadd_or_shadd_rtx_p (rtx x)
+{
+  return ((GET_CODE (x) == ASHIFT
+  || GET_CODE (x) == MULT)
+ && CONST_INT_P (XEXP (x, 1))
+ && ((GET_CODE (x) == ASHIFT && IN_RANGE (INTVAL (XEXP (x, 1)), 1, 3))
+ || (GET_CODE (x) == MULT
+ && IN_RANGE (exact_log2 (INTVAL (XEXP (x, 1))), 1, 3;
+}
+
 /* This function is used to implement LEGITIMIZE_ADDRESS.  If X can
be legitimized in a way that the generic machinery might not expect,
return a new address, otherwise return NULL.  MODE is the mode of
@@ -1830,6 +1846,32 @@ riscv_legitimize_address (rtx x, rtx oldx 
ATTRIBUTE_UNUSED,
   rtx base = XEXP (x, 0);
   HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
 
+  /* Handle (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C)) case.  
*/
+  if (GET_CODE (base) == PLUS && mem_shadd_or_shadd_rtx_p (XEXP (base, 0))
+ 

[Bug middle-end/100798] a?~t:t and (-(!!a))^t don't produce the same assembly code

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100798

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |14.0

--- Comment #5 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/110937] (bool0 ? bool1^1 : bool1) is not optimized to bool0 ^ bool1

2023-08-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110937

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Andrew Pinski  ---
Fixed.

[PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-09 Thread Eric Feng via Gcc-patches
Thank you for your help in getting dg-require-python-h working! I can
confirm that the FAILs are related to differences between the --cflags
affecting the gimple seen by the analyzer. For this reason, I have
changed it to --includes for now. To be sure, I tested on Python 3.8 as
well and it works as expected. I have also addressed the following
comments on the WIP patch as you described.

-- Update Changelog entry to list new functions being simulated.
-- Update region_model::get_or_create_region_for_heap_alloc leading
comment.
-- Change register_alloc to update_state_machine.
-- Change move_ptr_sval_non_null to transition_ptr_sval_non_null.
-- Static helper functions for:
-- Initializing ob_refcnt field.
-- Setting ob_type field.
-- Getting ob_base field.
-- Initializing heap allocated region for PyObjects.
-- Incrementing a field by one.
-- Change arg_is_long_p to arg_is_integral_p.
-- Extract common failure scenario for reusability.

The initial WIP patch using 

/* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */

have been bootstrapped and regtested on aarch64-unknown-linux-gnu. Since
we did not change any core logic in the revision and the only changes
within the analyzer core are changing variable names, is it OK for
trunk. In the mean time, the revised patch is currently going through
bootstrap and regtest process.

Best,
Eric

---
This patch adds known function subclasses for Python/C API functions
PyList_New, PyLong_FromLong, and PyList_Append. It also adds new
optional parameters for
region_model::get_or_create_region_for_heap_alloc, allowing for the
newly allocated region to immediately transition from the start state to
the assumed non-null state in the malloc state machine if desired.
Finally, it adds a new procedure, dg-require-python-h, intended as a
directive in Python-related analyzer tests, to append necessary Python
flags during the tests' build process.

The main warnings we gain in this patch with respect to the known function
subclasses mentioned are leak related. For example:

rc3.c: In function ‘create_py_object’:
│
rc3.c:21:10: warning: leak of ‘item’ [CWE-401] [-Wanalyzer-malloc-leak]
│
   21 |   return list;
  │
  |  ^~~~
│
  ‘create_py_object’: events 1-4
│
|
│
|4 |   PyObject* item = PyLong_FromLong(10);
│
|  |^~~
│
|  ||
│
|  |(1) allocated here
│
|  |(2) when ‘PyLong_FromLong’ succeeds
│
|5 |   PyObject* list = PyList_New(2);
│
|  |~
│
|  ||
│
|  |(3) when ‘PyList_New’ fails
│
|..
│
|   21 |   return list;
│
|  |  
│
|  |  |
│
|  |  (4) ‘item’ leaks here; was allocated at (1)
│

Some concessions were made to
simplify the analysis process when comparing kf_PyList_Append with the
real implementation. In particular, PyList_Append performs some
optimization internally to try and avoid calls to realloc if
possible. For simplicity, we assume that realloc is called every time.
Also, we grow the size by just 1 (to ensure enough space for adding a
new element) rather than abide by the heuristics that the actual implementation
follows.

gcc/analyzer/ChangeLog:
PR analyzer/107646
* region-model.cc (region_model::get_or_create_region_for_heap_alloc):
New optional parameters.
* region-model.h (class region_model): New optional parameters.
* sm-malloc.cc (on_realloc_with_move): New function.
(region_model::transition_ptr_sval_non_null): New function.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Analyzer support for
PyList_New, PyList_Append, PyLong_FromLong
* gcc.dg/plugin/plugin.exp: New test.
* lib/target-supports.exp: New procedure.
* gcc.dg/plugin/cpython-plugin-test-2.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/region-model.cc  |  20 +-
 gcc/analyzer/region-model.h   |  10 +-
 gcc/analyzer/sm-malloc.cc |  40 +
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 711 ++
 .../gcc.dg/plugin/cpython-plugin-test-2.c |  78 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   3 +-
 gcc/testsuite/lib/target-supports.exp |  25 +
 7 files changed, 881 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index e92b3f7b074..c338f045d92 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats (const 
svalue *size_in_bytes,
Use CTXT to complain about tainted sizes.
 
Reuse an existing 

[PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

2023-08-09 Thread Eric Feng via Gcc
Thank you for your help in getting dg-require-python-h working! I can
confirm that the FAILs are related to differences between the --cflags
affecting the gimple seen by the analyzer. For this reason, I have
changed it to --includes for now. To be sure, I tested on Python 3.8 as
well and it works as expected. I have also addressed the following
comments on the WIP patch as you described.

-- Update Changelog entry to list new functions being simulated.
-- Update region_model::get_or_create_region_for_heap_alloc leading
comment.
-- Change register_alloc to update_state_machine.
-- Change move_ptr_sval_non_null to transition_ptr_sval_non_null.
-- Static helper functions for:
-- Initializing ob_refcnt field.
-- Setting ob_type field.
-- Getting ob_base field.
-- Initializing heap allocated region for PyObjects.
-- Incrementing a field by one.
-- Change arg_is_long_p to arg_is_integral_p.
-- Extract common failure scenario for reusability.

The initial WIP patch using 

/* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */

have been bootstrapped and regtested on aarch64-unknown-linux-gnu. Since
we did not change any core logic in the revision and the only changes
within the analyzer core are changing variable names, is it OK for
trunk. In the mean time, the revised patch is currently going through
bootstrap and regtest process.

Best,
Eric

---
This patch adds known function subclasses for Python/C API functions
PyList_New, PyLong_FromLong, and PyList_Append. It also adds new
optional parameters for
region_model::get_or_create_region_for_heap_alloc, allowing for the
newly allocated region to immediately transition from the start state to
the assumed non-null state in the malloc state machine if desired.
Finally, it adds a new procedure, dg-require-python-h, intended as a
directive in Python-related analyzer tests, to append necessary Python
flags during the tests' build process.

The main warnings we gain in this patch with respect to the known function
subclasses mentioned are leak related. For example:

rc3.c: In function ‘create_py_object’:
│
rc3.c:21:10: warning: leak of ‘item’ [CWE-401] [-Wanalyzer-malloc-leak]
│
   21 |   return list;
  │
  |  ^~~~
│
  ‘create_py_object’: events 1-4
│
|
│
|4 |   PyObject* item = PyLong_FromLong(10);
│
|  |^~~
│
|  ||
│
|  |(1) allocated here
│
|  |(2) when ‘PyLong_FromLong’ succeeds
│
|5 |   PyObject* list = PyList_New(2);
│
|  |~
│
|  ||
│
|  |(3) when ‘PyList_New’ fails
│
|..
│
|   21 |   return list;
│
|  |  
│
|  |  |
│
|  |  (4) ‘item’ leaks here; was allocated at (1)
│

Some concessions were made to
simplify the analysis process when comparing kf_PyList_Append with the
real implementation. In particular, PyList_Append performs some
optimization internally to try and avoid calls to realloc if
possible. For simplicity, we assume that realloc is called every time.
Also, we grow the size by just 1 (to ensure enough space for adding a
new element) rather than abide by the heuristics that the actual implementation
follows.

gcc/analyzer/ChangeLog:
PR analyzer/107646
* region-model.cc (region_model::get_or_create_region_for_heap_alloc):
New optional parameters.
* region-model.h (class region_model): New optional parameters.
* sm-malloc.cc (on_realloc_with_move): New function.
(region_model::transition_ptr_sval_non_null): New function.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Analyzer support for
PyList_New, PyList_Append, PyLong_FromLong
* gcc.dg/plugin/plugin.exp: New test.
* lib/target-supports.exp: New procedure.
* gcc.dg/plugin/cpython-plugin-test-2.c: New test.

Signed-off-by: Eric Feng 
---
 gcc/analyzer/region-model.cc  |  20 +-
 gcc/analyzer/region-model.h   |  10 +-
 gcc/analyzer/sm-malloc.cc |  40 +
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 711 ++
 .../gcc.dg/plugin/cpython-plugin-test-2.c |  78 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   3 +-
 gcc/testsuite/lib/target-supports.exp |  25 +
 7 files changed, 881 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index e92b3f7b074..c338f045d92 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats (const 
svalue *size_in_bytes,
Use CTXT to complain about tainted sizes.
 
Reuse an existing 

[Bug tree-optimization/110937] (bool0 ? bool1^1 : bool1) is not optimized to bool0 ^ bool1

2023-08-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110937

--- Comment #3 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:7fb65f102851248bafa0815401d8bdcea6d7626c

commit r14-3110-g7fb65f102851248bafa0815401d8bdcea6d7626c
Author: Andrew Pinski 
Date:   Mon Aug 7 10:47:09 2023 -0700

MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

This adds a simple match pattern for this case.
I noticed it a couple of different places.
One while I was looking at code generation of a parser and
also while I was looking at locations where bitwise_inverted_equal_p
should be used more.

Committed as approved after bootstrapped and tested on x86_64-linux-gnu
with no regressions.

PR tree-optimization/110937
PR tree-optimization/100798

gcc/ChangeLog:

* match.pd (`a ? ~b : b`): Handle this
case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-14.c: New test.
* gcc.dg/tree-ssa/bool-15.c: New test.
* gcc.dg/tree-ssa/phi-opt-33.c: New test.
* gcc.dg/tree-ssa/20030709-2.c: Update testcase
so `a ? -1 : 0` is not used to hit the match
pattern.

[Bug middle-end/100798] a?~t:t and (-(!!a))^t don't produce the same assembly code

2023-08-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100798

--- Comment #4 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:7fb65f102851248bafa0815401d8bdcea6d7626c

commit r14-3110-g7fb65f102851248bafa0815401d8bdcea6d7626c
Author: Andrew Pinski 
Date:   Mon Aug 7 10:47:09 2023 -0700

MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

This adds a simple match pattern for this case.
I noticed it a couple of different places.
One while I was looking at code generation of a parser and
also while I was looking at locations where bitwise_inverted_equal_p
should be used more.

Committed as approved after bootstrapped and tested on x86_64-linux-gnu
with no regressions.

PR tree-optimization/110937
PR tree-optimization/100798

gcc/ChangeLog:

* match.pd (`a ? ~b : b`): Handle this
case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-14.c: New test.
* gcc.dg/tree-ssa/bool-15.c: New test.
* gcc.dg/tree-ssa/phi-opt-33.c: New test.
* gcc.dg/tree-ssa/20030709-2.c: Update testcase
so `a ? -1 : 0` is not used to hit the match
pattern.

[PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-09 Thread Andrew Pinski via Gcc-patches
This adds a simple match pattern for this case.
I noticed it a couple of different places.
One while I was looking at code generation of a parser and
also while I was looking at locations where bitwise_inverted_equal_p
should be used more.

Committed as approved after bootstrapped and tested on x86_64-linux-gnu with no 
regressions.

PR tree-optimization/110937
PR tree-optimization/100798

gcc/ChangeLog:

* match.pd (`a ? ~b : b`): Handle this
case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-14.c: New test.
* gcc.dg/tree-ssa/bool-15.c: New test.
* gcc.dg/tree-ssa/phi-opt-33.c: New test.
* gcc.dg/tree-ssa/20030709-2.c: Update testcase
so `a ? -1 : 0` is not used to hit the match
pattern.
---
 gcc/match.pd   | 14 ++
 gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
 5 files changed, 63 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 9b4819e5be7..fc630b63563 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6460,6 +6460,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (cmp == NE_EXPR)
{ constant_boolean_node (true, type); })))
 
+#if GIMPLE
+/* a?~t:t -> (-(a))^t */
+(simplify
+ (cond @0 @1 @2)
+ (if (INTEGRAL_TYPE_P (type)
+  && bitwise_inverted_equal_p (@1, @2))
+  (with {
+auto prec = TYPE_PRECISION (type);
+auto unsign = TYPE_UNSIGNED (type);
+tree inttype = build_nonstandard_integer_type (prec, unsign);
+   }
+   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype @2))
+#endif
+
 /* Simplify pointer equality compares using PTA.  */
 (for neeq (ne eq)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
index 5009cd69cfe..78938f919d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
@@ -29,15 +29,16 @@ union tree_node
 };
 int make_decl_rtl (tree, int);
 void *
-get_alias_set (t)
+get_alias_set (t, t1)
  tree t;
+ void *t1;
 {
   long set;
   if (t->decl.rtl)
 return (t->decl.rtl->fld[1].rtmem 
? 0
: (((t->decl.rtl ? t->decl.rtl: (make_decl_rtl (t, 0), 
t->decl.rtl)))->fld[1]).rtmem);
-  return (void*)-1;
+  return t1;
 }
 
 /* There should be precisely one load of ->decl.rtl.  If there is
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
new file mode 100644
index 000..0149380a63b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/110937 */
+
+_Bool f2(_Bool a, _Bool b)
+{
+if (a)
+  return !b;
+return b;
+}
+
+/* We should be able to remove the conditional and convert it to an xor. */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
new file mode 100644
index 000..1f496663863
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/110937 */
+
+_Bool f2(int x, int y, int w, int z)
+{
+  _Bool a = x == y;
+  _Bool b = w == z;
+  if (a)
+return !b;
+  return b;
+}
+
+/* We should be able to remove the conditional and convert it to an xor. */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "ne_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
new file mode 100644
index 000..b79fe44187a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/100798 */
+
+int f(int a, int t)
+{
+  return (a=='s' ? ~t : t);
+}
+
+/* This should be convert into t^-(a=='s').  */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "negate_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "bit_not_expr, " "optimized" } } 

Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-09 Thread Kees Cook via Gcc-patches
On Mon, Aug 07, 2023 at 04:33:13PM +, Qing Zhao wrote:
> What’s the testing case for the one that failed? 
> If it’s 
> 
> __builtin_dynamic_object_size(p->array, 0/2) without the allocation 
> information in the routine, 
> then with the current algorithm, gcc cannot deduce the size for the whole 
> object.
> 
> If not such case, let me know.

I found some more bugs in my tests (now fixed), but I'm left with a single
failure case, which is think it going to boil down to pointer/pointee
issue we discussed earlier.

Using my existing testing tool:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

I see this error with the "counted_by_seen_by_bdos" case:

Expected __builtin_dynamic_object_size(p, 1) (18446744073709551615) == 
sizeof(*p) + p->count * sizeof(*p->array) (80)

A reduced view of the code is:

struct annotated *p;
int index = MAX_INDEX + unconst;

p = alloc_annotated(index);

EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->count * 
sizeof(*p->array));

It looks like __bdos will not use the __counted_by information from the
pointee if the argument is only the pointer. i.e. this test works:

EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->count * 
sizeof(*p->array));

However, I thought if any part of the pointee was used (e.g. p->count,
p->array), GCC would be happy to start using the pointee details?

And, again, for the initial version at this feature, I'm fine with this
failing test being declared not a valid test. :) But I'll need some
kind of builtin that can correctly interrogate a pointer to find the
full runtime size with the assumption that pointer is valid, but that
can come later.

And as a side note, I am excited to see the very correct warnings for
bad (too late) assignment of the __counted_by member:

p->array[0] = 0;
p->count = 1;

array-bounds.c: In function 'invalid_assignment_order':
array-bounds.c:366:17: warning: '*p.count' is used uninitialized 
[-Wuninitialized]
  366 | p->array[0] = 0;
  | ^~~

Yay! :)

-Kees

-- 
Kees Cook


[Bug middle-end/110962] New: RISC-V vector Fortran ICE in expand_expr_real_2

2023-08-09 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110962

Bug ID: 110962
   Summary: RISC-V vector Fortran ICE in expand_expr_real_2
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jeremy.bennett at embecosm dot com
  Target Milestone: ---

Created attachment 55712
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55712=edit
Fortran source for test case

The following code (testcase.f90) causes an ICE when using RISC-V vector as
target.

SUBROUTINE a(b,c,d)
  LOGICAL,DIMENSION(INOUT)  :: b
  LOGICAL e
  REAL, DIMENSION(IN) ::  c
  REAL, DIMENSION(INOUT)  ::  d
  REAL, DIMENSION(SIZE(c))   :: f
  WHERE (b.AND.e)
 WHERE (f>=0.)
d = g
 ENDWHERE
  ENDWHERE
END SUBROUTINE a

Compiled with:

riscv64-unknown-linux-gnu-gfortran -march=rv64gcv -mabi=lp64d \
   -c -Ofast --param=riscv-autovec-preference=scalable testcase.f90

Output is:

during RTL pass: expand
testcase.f90:1:12:

1 | SUBROUTINE a(b,c,d)
  |^
internal compiler error: in expand_expr_real_2, at expr.cc:10566
0x830f6d expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/home/jeremy/gittrees/mustang/gcc/gcc/expr.cc:10566
0xc04ab0 expand_gimple_stmt_1
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:3983
0xc04ab0 expand_gimple_stmt
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:4044
0xc097c6 expand_gimple_basic_block
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:6096
0xc0b176 execute
/home/jeremy/gittrees/mustang/gcc/gcc/cfgexpand.cc:6831
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

System information
--

Using built-in specs.
COLLECT_GCC=riscv64-unknown-linux-gnu-gfortran
COLLECT_LTO_WRAPPER=/home/jeremy/gittrees/mustang/install/libexec/gcc/riscv64-unknown-linux-gnu/14.0.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with: /home/jeremy/gittrees/mustang/gcc/configure
--target=riscv64-unknown-linux-gnu
--prefix=/home/jeremy/gittrees/mustang/install
--with-sysroot=/home/jeremy/gittrees/mustang/install/sysroot
--with-pkgversion=g5c27c911f6b --with-system-zlib --enable-shared --enable-tls
--enable-languages=c,c++,fortran --disable-libmudflap --disable-libssp
--disable-libquadmath --disable-libsanitizer --disable-nls --disable-bootstrap
--src=/home/jeremy/gittrees/mustang/gcc --enable-multilib --with-abi=lp64d
--with-arch=rv64gc --with-tune= --with-isa-spec=20191213 'CFLAGS_FOR_TARGET=-O2
   -mcmodel=medany' 'CXXFLAGS_FOR_TARGET=-O2-mcmodel=medany'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230809 (experimental) (g5c27c911f6b)

  1   2   3   >