Re: GSoC project idea: Separate Host Process Offloading

2023-02-07 Thread Tobias Burnus

Hi Thomas, hi all,

On 08.02.23 00:26, Thomas Schwinge wrote:

On 2023-02-02T22:13:20+0100, I wrote:

I'd offer to co-mentor, but I'd rather not be the only one.

Still looking for someone to join, please!  :-)

As kind of mentioned before, I am willing to co-mentor :-)

Here it is:
https://gcc.gnu.org/wiki/SummerOfCode#Separate_Host_Process_Offloading
Please have a look, and fix up if necessary.


Thanks. (Disclaimer: I still have to read it.)

I think this could be also the basis for remote OpenMP offload handling
via the to-be-created target-specific plugin. In this case, to do
offloading to the remote CPU. Writing such an RPC wrapper would be an
independent task, which then should also handle the offloading to a
remote GPU.  — Citing an LLVM paper about such a feature:

"The basic idea behind the Remote OpenMP Offloading implementation is to
provide a transparent communication channel between the
target-independent library on the host with the target-dependent library
on the remote system. This use case matches the well-known remote
procedure calls (RPC) idiom. To facilitate communication, we added two
new components into LLVM/OpenMP that build a tunnel from the host to the
remote system through which all plugin API calls ( in Fig. 3) are
forwarded. The first is a remote offloading plugin () which presents
itself to the host as any other plugin would, i.e., it looks no
different than the NVIDIA GPU offloading plugin. The second is a server
application that must be running on the remote system () that mimics
when it communicates with the remote device plugin, such as the one for
NVIDIA GPU offloading. The bottom row of Fig. 3 illustrates their
interaction with the existing infrastructure."

(From https://link.springer.com/chapter/10.1007/978-3-031-07312-0_16
(requires subscription))

About this topic, see also
https://www.hpcwire.com/off-the-wire/remote-openmp-offloading-paper-recognized-at-isc/
and https://baodishan.com/assets/pdf/iwomp22.pdf (both freely
accessible, the latter has a lot of details).

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2023-02-07 Thread HAO CHEN GUI via Gcc-patches
Hi,
  The logical operations for TImode is split after reload pass right now. Some
potential optimizations miss as the split is too late. This patch removes
TImode from "AND", "IOR", "XOR" and "NOT" expander so that these logical
operations can be split at expand pass. The new test case illustrates the
optimization.

  Two test cases of pr92398 are merged into one as all sub-targets generates
the same sequence of instructions with the patch.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen


ChangeLog
2023-02-08  Haochen Gui 

gcc/
PR target/100694
* config/rs6000/rs6000.md (BOOL_128_V): New mode iterator for 128-bit
vector types.
(and3): Replace BOOL_128 with BOOL_128_V.
(ior3): Likewise.
(xor3): Likewise.
(one_cmpl2 expander): New expander with BOOL_128_V.
(one_cmpl2 insn_and_split): Rename to ...
(*one_cmpl2): ... this.

gcc/testsuite/
PR target/100694
* gcc.target/powerpc/pr100694.c: New.
* gcc.target/powerpc/pr92398.c: New.
* gcc.target/powerpc/pr92398.h: Remove.
* gcc.target/powerpc/pr92398.p9-.c: Remove.
* gcc.target/powerpc/pr92398.p9+.c: Remove.


patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 4bd1dfd3da9..455b7329643 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -743,6 +743,15 @@ (define_mode_iterator BOOL_128 [TI
 (V2DF  "TARGET_ALTIVEC")
 (V1TI  "TARGET_ALTIVEC")])

+;; Mode iterator for logical operations on 128-bit vector types
+(define_mode_iterator BOOL_128_V   [(V16QI "TARGET_ALTIVEC")
+(V8HI  "TARGET_ALTIVEC")
+(V4SI  "TARGET_ALTIVEC")
+(V4SF  "TARGET_ALTIVEC")
+(V2DI  "TARGET_ALTIVEC")
+(V2DF  "TARGET_ALTIVEC")
+(V1TI  "TARGET_ALTIVEC")])
+
 ;; For the GPRs we use 3 constraints for register outputs, two that are the
 ;; same as the output register, and a third where the output register is an
 ;; early clobber, so we don't have to deal with register overlaps.  For the
@@ -7135,23 +7144,23 @@ (define_expand "subti3"
 ;; 128-bit logical operations expanders

 (define_expand "and3"
-  [(set (match_operand:BOOL_128 0 "vlogical_operand")
-   (and:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
- (match_operand:BOOL_128 2 "vlogical_operand")))]
+  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
+   (and:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
+   (match_operand:BOOL_128_V 2 "vlogical_operand")))]
   ""
   "")

 (define_expand "ior3"
-  [(set (match_operand:BOOL_128 0 "vlogical_operand")
-(ior:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
- (match_operand:BOOL_128 2 "vlogical_operand")))]
+  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
+   (ior:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
+   (match_operand:BOOL_128_V 2 "vlogical_operand")))]
   ""
   "")

 (define_expand "xor3"
-  [(set (match_operand:BOOL_128 0 "vlogical_operand")
-(xor:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
- (match_operand:BOOL_128 2 "vlogical_operand")))]
+  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
+   (xor:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
+   (match_operand:BOOL_128_V 2 "vlogical_operand")))]
   ""
   "")

@@ -7449,7 +7458,14 @@ (define_insn_and_split "*eqv3_internal2"
 (const_string "16")))])

 ;; 128-bit one's complement
-(define_insn_and_split "one_cmpl2"
+(define_expand "one_cmpl2"
+[(set (match_operand:BOOL_128_V 0 "vlogical_operand" "=")
+   (not:BOOL_128_V
+ (match_operand:BOOL_128_V 1 "vlogical_operand" "")))]
+  ""
+  "")
+
+(define_insn_and_split "*one_cmpl2"
   [(set (match_operand:BOOL_128 0 "vlogical_operand" "=")
(not:BOOL_128
  (match_operand:BOOL_128 1 "vlogical_operand" "")))]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c 
b/gcc/testsuite/gcc.target/powerpc/pr100694.c
new file mode 100644
index 000..96a895d6c44
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 3 } } */
+
+/* It just needs two std and one blr.  */
+void foo (unsigned __int128* res, unsigned long long hi, unsigned long long lo)
+{
+   unsigned __int128 i = hi;
+   i <<= 64;
+   i |= lo;
+   *res = i;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.c 

Re: [pushed] [PR103541] RA: Implement reuse of equivalent memory for caller saves optimization

2023-02-07 Thread Andrew Pinski via Gcc-patches
On Tue, Feb 7, 2023 at 6:08 AM Vladimir Makarov via Gcc-patches
 wrote:
>
> The following patch solves
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103541
>
> The patch was successfully bootstrapped and tested on x86-64, aarch64,
> and ppc64le.

What languages did you test? Because I think I am getting a bootstrap
failure while building libgo in 32bit x86 due to this patch.

libtool: compile:
/home/apinski/src/upstream-gcc-git/gcc/objdir/./gcc/gccgo
-B/home/apinski/src/upstream-gcc-git/gcc/objdir/./gcc/
-B/home/apinski/upstream-gcc/x86_64-pc-linux-gnu/bin/
-B/home/apinski/upstream-gcc/x86_64-pc-linux-gnu/lib/ -isystem
/home/apinski/upstream-gcc/x86_64-pc-linux-gnu/include -isystem
/home/apinski/upstream-gcc/x86_64-pc-linux-gnu/sys-include
-minline-all-stringops -O2 -g -m32 -I . -c -fgo-pkgpath=reflect
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/deepequal.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/eqtype.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/makefunc.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/makefunc_ffi.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/swapper.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/type.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/value.go
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/visiblefields.go
 -fPIC -o .libs/reflect.o
during RTL pass: reload
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/type.go: In
function ‘reflect.rtype.Method’:
/home/apinski/src/upstream-gcc-git/gcc/libgo/go/reflect/type.go:604:1:
internal compiler error: in get_equiv, at lra-constraints.cc:534
  604 | func (t *rtype) Method(i int) (m Method) {
  | ^
0x7f9cfb get_equiv
/home/apinski/src/upstream-gcc-git/gcc/gcc/lra-constraints.cc:534
0xde58d4 lra_constraints(bool)
/home/apinski/src/upstream-gcc-git/gcc/gcc/lra-constraints.cc:5052
0xdd2532 lra(_IO_FILE*)
/home/apinski/src/upstream-gcc-git/gcc/gcc/lra.cc:2375
0xd8a101 do_reload
/home/apinski/src/upstream-gcc-git/gcc/gcc/ira.cc:5955
0xd8a101 execute
/home/apinski/src/upstream-gcc-git/gcc/gcc/ira.cc:6141
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
Makefile:3043: recipe for target 'reflect.lo' failed
make[7]: *** [reflect.lo] Error 1


Thanks,
Andrew Pinski


[PATCH] RISC-V: Fix indent [NFC]

2023-02-07 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Fix indent.
---
 gcc/config/riscv/vector-iterators.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 858415bd6b1..c3f3f1b01f8 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -34,8 +34,8 @@
   UNSPEC_VMULHU
   UNSPEC_VMULHSU
 
-   UNSPEC_VADC
-   UNSPEC_VSBC
+  UNSPEC_VADC
+  UNSPEC_VSBC
 ])
 
 (define_mode_iterator V [
-- 
2.36.1




[PATCH] RISC-V: Add vadc/vsbc C/C++ API support

2023-02-07 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (simm5_p): Add vadc/vsbc support.
* config/riscv/riscv-v.cc (simm32_p): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class vadc): New class.
(class vsbc): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vadc): Ditto.
(vsbc): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct 
no_mask_policy_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc 
(rvv_arg_type_info::get_base_vector_type): Add vadc/vsbc support.
(rvv_arg_type_info::get_tree_type): Ditto.
(function_expander::use_exact_insn): Ditto.
* config/riscv/riscv-vector-builtins.h (enum rvv_base_type): Ditto.
(function_base::use_mask_predication_p): New function.
* config/riscv/vector-iterators.md: New iterator.
* config/riscv/vector.md (@pred_adc): New pattern.
(@pred_sbc): Ditto.
(@pred_adc_scalar): Ditto.
(@pred_sbc_scalar): Ditto.
(*pred_adc_scalar): Ditto.
(*pred_adc_extended_scalar): Ditto.
(*pred_sbc_scalar): Ditto.
(*pred_sbc_extended_scalar): Ditto.

---
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-v.cc   |   2 +-
 .../riscv/riscv-vector-builtins-bases.cc  |  46 +++
 .../riscv/riscv-vector-builtins-bases.h   |   2 +
 .../riscv/riscv-vector-builtins-functions.def |   4 +
 .../riscv/riscv-vector-builtins-shapes.cc |  22 ++
 .../riscv/riscv-vector-builtins-shapes.h  |   1 +
 gcc/config/riscv/riscv-vector-builtins.cc |  50 ++-
 gcc/config/riscv/riscv-vector-builtins.h  |  12 +
 gcc/config/riscv/vector-iterators.md  |   3 +
 gcc/config/riscv/vector.md| 295 +-
 11 files changed, 430 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index e090c61a3e3..a4476e6235f 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -175,6 +175,7 @@ enum mask_policy get_prefer_mask_policy ();
 rtx get_avl_type_rtx (enum avl_type);
 opt_machine_mode get_vector_mode (scalar_mode, poly_uint64);
 extern bool simm32_p (rtx);
+extern bool simm5_p (rtx);
 extern bool neg_simm5_p (rtx);
 #ifdef RTX_CODE
 extern bool has_vi_variant_p (rtx_code, rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index df89c9be308..83cb1f83606 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -408,7 +408,7 @@ simm32_p (rtx x)
   return val <= 0x7FFFULL || val >= 0x8000ULL;
 }
 
-static bool
+bool
 simm5_p (rtx x)
 {
   if (!CONST_INT_P (x))
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index e14a1854eee..8bf5bb97ca1 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -296,6 +296,48 @@ public:
   }
 };
 
+/* Implements vadc.  */
+class vadc : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+  bool use_mask_predication_p () const override { return false; }
+
+  rtx expand (function_expander ) const override
+  {
+switch (e.op_info->op)
+  {
+  case OP_TYPE_vvm:
+   return e.use_exact_insn (code_for_pred_adc (e.vector_mode ()));
+  case OP_TYPE_vxm:
+   return e.use_exact_insn (code_for_pred_adc_scalar (e.vector_mode ()));
+  default:
+   gcc_unreachable ();
+  }
+  }
+};
+
+/* Implements vsbc.  */
+class vsbc : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+  bool use_mask_predication_p () const override { return false; }
+
+  rtx expand (function_expander ) const override
+  {
+switch (e.op_info->op)
+  {
+  case OP_TYPE_vvm:
+   return e.use_exact_insn (code_for_pred_sbc (e.vector_mode ()));
+  case OP_TYPE_vxm:
+   return e.use_exact_insn (code_for_pred_sbc_scalar (e.vector_mode ()));
+  default:
+   gcc_unreachable ();
+  }
+  }
+};
+
 static CONSTEXPR const vsetvl vsetvl_obj;
 static CONSTEXPR const vsetvl vsetvlmax_obj;
 static CONSTEXPR const loadstore vle_obj;
@@ -354,6 +396,8 @@ static CONSTEXPR const widen_binopvwmulu_obj;
 static CONSTEXPR const vwmulsu vwmulsu_obj;
 static CONSTEXPR const vwcvt vwcvt_x_obj;
 static CONSTEXPR const vwcvt vwcvtu_x_obj;
+static CONSTEXPR const vadc vadc_obj;
+static CONSTEXPR const vsbc vsbc_obj;
 static CONSTEXPR const binop vsadd_obj;
 static CONSTEXPR const binop vssub_obj;
 static CONSTEXPR const binop vsaddu_obj;
@@ -422,6 +466,8 @@ BASE (vwmulu)
 BASE (vwmulsu)
 BASE (vwcvt_x)
 BASE (vwcvtu_x)
+BASE (vadc)
+BASE (vsbc)
 BASE (vsadd)
 BASE (vssub)
 

Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Joseph Myers
On Tue, 7 Feb 2023, Qing Zhao via Gcc-patches wrote:

> Then, this routine (flexible_array_type_p) is mainly for diagnostic purpose.
> It cannot be used to determine whether the structure/union type recursively
> include a flexible array member at the end.
> 
> Is my understanding correct?

My comments were about basic principles of what gets diagnosed, and the 
need for different predicates in different contexts; I wasn't trying to 
assert anything about how that maps onto what functions should be used in 
what contexts.

> >> 2. Only C99 standard flexible array member be included, [0] and [1] are 
> >> not included, for example:
> > 
> > Obviously we can't diagnose use of structures with [1] trailing members, 
> > because it's perfectly valid to embed those structures at any position 
> > inside other structures.  And the same is the case for the [0] extension 
> > when it's used to mean "empty array" rather than "flexible array".
> 
> With the -fstrict-flex-arrays available, we should be able to diagnose
> the flexible array member per gnu extension (i.e [0] or [1]) the same as []. 

There are different sorts of diagnostic that might be involved.

* Simply having [0] or [1] at the end of a structure embedded in another 
structure isn't appropriate to diagnose, because [0] and [1] have 
perfectly good meanings in such a context that aren't trying to be 
flexible array members at all.  [0] might be an empty type (possibly one 
that wouldn't be empty when built with a different configuration).  [1] 
might be the use of arrays in C to produce a passed-by-reference type.

* Trying to use such an embedded [0] or [1] array as if it were a flexible 
array member - i.e. accessing any member of the [0] array, or any member 
other than the [0] member of the [1] array - *is* a sign of the 
problematic use as a flexible array member, that might be appropriate to 
diagnose.  (Actually I'd guess the array index tends to be non-constant in 
accesses, and it would be odd to use a non-constant index when you mean 
that constant always to be 0, which it would need to be in the 
non-flexible case.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GSoC project idea: Separate Host Process Offloading

2023-02-07 Thread Thomas Schwinge
Hi!

On 2023-02-02T22:13:20+0100, I wrote:
> On 2023-02-01T16:12:07+0100, Martin Jambor  wrote:
>> On Thu, Oct 20 2022, Richard Biener via Gcc-patches wrote:
 Am 20.10.2022 um 14:41 schrieb Jakub Jelinek via Gcc-patches 
 :
 On Thu, Oct 20, 2022 at 12:33:28PM +, Michael Matz wrote:
>> On Thu, 20 Oct 2022, Thomas Schwinge wrote:
>> This had been done in
>> wwwdocs commit 5c7ecfb5627e412a3d142d8dc212f4cd39b3b73f
>> "Document deprecation of OpenMP MIC offloading in GCC 12".
>>
>> I'm sad about this, because -- in theory -- such a plugin is very useful
>> for offloading simulation/debugging (separate host/device memory spaces,
>> allow sanitizers to run on offloaded code
>
> Yeah, I think that's a _very_ useful feature, but indeed ...
>
>> (like LLVM a while ago
>> implemented), and so on), but all that doesn't help -- in practice -- if
>> nobody is maintaining that code.
>
> ... it should then be somewhat maintained properly.  Maybe the
> MIC-specifics could be removed from the code, and it could be transformed
> into a "null"-offload target, as example and testing vehicle (and implying
> that such new liboffloadmic^H^H^Hnull would have its upstream in the GCC
> repo).  Alas, if noone is going to do that work removing is the right
> choice.

 Yeah.  But we really shouldn't need a large MIC specific library for that,
 everything should be implementable with a simple portable plugin that just
 forks + execs the offloading ELF and transfers data to/out of it etc.
 And the config/i386/intelmic-mkoffload etc. stuff would need to be done
 somewhere in generic code, such that we can do it for all targets.
 Also ideally by using just the normal lto1 with some special option that
 it acts as an offloading compiler, so that we don't need to bother with
 building a separate offloading compiler for it.
 True, everything guarded with #ifdef ACCEL_COMPILER etc. would need to
 change into code guarded with some option.
>>>
>>> Might be a nice GSoC project …
>>
>> I really think it could be.
>
> Agreed!  Something like: "Separate Host Process Offloading"!  (Back
> in October, I actually had made a TODO note to put this one onto
> , but so far...)
>
>> Would any one of those involved in this
>> thread be willing to mentor it?
>
> I'd offer to co-mentor, but I'd rather not be the only one.

Still looking for someone to join, please!  :-)

> I'm now off for FOSDEM, but unless someone gets it done before, I'll cook
> up a GSoC project idea text when I'm back, on Tuesday.

Here it is:
.
Please have a look, and fix up if necessary.


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [Patch] libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET

2023-02-07 Thread Thomas Schwinge
Hi Tobias!

On 2023-02-06T12:52:11+0100, Tobias Burnus  wrote:
> Seems as if I missed a GOMP_MAP_TO_PSET issue before. As nvptx is
> XFAILed before, I only found it when testing on AMDGCN.
>
> For an array-descriptor 'ai' variable, omplower has:
>map(tofrom:MEM  [(integer(kind=4)[0:] *)D.4346] [len: 
> D.4345])
>map(to:ai [pointer set, len: 64])
>map(alloc:ai.data [pointer assign, bias: 0])
>
> The latter reaches GCC with the same address as 'ai' – i.e. the
> one of the array descriptor. This then needs to be dereferenced
> to get the address of the actual pointer.
> The patch assumes (and asserts) that 'ai.data' is part of the 'ai'
> such that I can use the host address of 'ai' to access the data.
> If that's not guaranteed, we have to find another way (e.g. another
> lookup). But so far it seems to hold and I have not seen a bias
> other than 0.
>
> With that patch, libgomp.fortran/reverse-offload-5.f90 now works
> with AMDGCN.

Confirming the latter, thanks.  (..., but I've not actually reviewed the
code changes.)

> OK? Any comments to the attached patch?

Yes:

> libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
>
> libgomp/
>   * target.c (gomp_target_rev): Dereference ptr
>   to get device address.
>   * libgomp.fortran/reverse-offload-5.f90: Add test
>   for unallocated allocatable.
>
>  libgomp/target.c| 7 ++-
>  libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90 | 6 --
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/libgomp/target.c b/libgomp/target.c
> index c1682caea13..5cdd845291a 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -3579,8 +3579,13 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, 
> uint64_t devaddrs_ptr,
> }
>   int k;
>   n2 = NULL;
> - cdata[i].present = true;
> + /* Dereference devaddrs[j] to get the device addr.  */
> + assert (devaddrs[j]-sizes[j] == cdata[i].devaddr);
> + devaddrs[j] = *(uint64_t *) (devaddrs[i] + sizes[j]);

For x86_64-pc-linux-gnu '-m32' multilib:

[...]
[...]/source-gcc/libgomp/target.c: In function ‘gomp_target_rev’:
[...]/source-gcc/libgomp/target.c:3615:36: error: cast to pointer from 
integer of different size [-Werror=int-to-pointer-cast]
 3615 | devaddrs[j] = *(uint64_t *) (devaddrs[i] + 
sizes[j]);
  |^
cc1: all warnings being treated as errors
make[9]: *** [target.lo] Error 1
make[9]: Leaving directory `[...]/build-gcc/x86_64-pc-linux-gnu/32/libgomp'
[...]

I suppose you'd do similar to what you already have a few lines above;
that is, cast through 'uintptr_t':

devaddrs[j] = *(uint64_t *) (uintptr_t) (devaddrs[i] + sizes[j]);


Grüße
 Thomas


> + cdata[j].present = true;
>   cdata[j].devaddr = devaddrs[j];
> + if (devaddrs[j] == 0)
> +   continue;
>   k = gomp_map_cdata_lookup (cdata, devaddrs, kinds, sizes, j,
>  devaddrs[j],
>  devaddrs[j] + sizeof (void*),
> diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90 
> b/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
> index ef7eb7bdd52..16810eb47de 100644
> --- a/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
> +++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
> @@ -24,7 +24,7 @@ s2 = 55
>
>  !$omp target map(to: A, A2, s1, s2)
>  block
> -  integer, allocatable :: ai(:), ai2(:), si1, si2
> +  integer, allocatable :: ai(:), ai2(:), ai3(:), si1, si2, si3
>
>a = a * 2
>a2 = a2 * 3
> @@ -38,7 +38,7 @@ block
>
>!$omp target device (ancestor:1)  &
>!$omp&   map(to: A, s1, ai, si1) map(always, to: a2, s2)  &
> -  !$omp&   map(tofrom: ai2, si2)
> +  !$omp&   map(tofrom: ai2, si2, ai3, si3)
>  if (shared_mem) then
>if (any (a  /= 2 * [1,2,3,4])) stop 1
>if (s1 /= 4 * 532) stop 2
> @@ -52,6 +52,7 @@ block
>  if (any (ai2 /= [8,4,7,1])) stop 8
>  if (si1 /= 64) stop 9
>  if (si2 /= 765) stop 10
> +if (allocated (ai3) .or. allocated(si3)) stop 26
>
>  a = a*3
>  a2 = a2*7
> @@ -80,6 +81,7 @@ block
>endif
>if (any (ai2 /= 21 * [8,4,7,1])) stop 24
>if (si2 /= 31 * 765) stop 25
> +  if (allocated (ai3) .or. allocated(si3)) stop 27
>
>deallocate (ai, ai2, si1, si2)
>  end block
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Fix 'libgomp.fortran/reverse-offload-6.f90' nvptx offloading compilation (was: [Patch] libgomp: Fix reverse offload issues)

2023-02-07 Thread Thomas Schwinge
Hi Tobias!

On 2023-02-03T11:38:29+0100, Tobias Burnus  wrote:
> Now committed as obvious as
> r13-5680-g0b1ce70a813b98ef2893779d14ad6c90c5d06a71.

I see:

FAIL: libgomp.fortran/reverse-offload-6.f90   -O  (test for excess errors)

... due to:

nvptx mkoffload: warning: 'omp requires reverse_offload' requires at least 
'sm_35' for '-foffload-options=nvptx-none=-march=' - disabling offload-code 
generation for this device type

I've pushed to master branch
commit 7ab75a6e6dfd81c566f848ef0d8d0ee15b81
"Fix 'libgomp.fortran/reverse-offload-6.f90' nvptx offloading compilation",
see attached.


Grüße
 Thomas


> I improved the wording in the commit comment a bit, compared to previous
> attachment and I have verified that those features work with AMDGCN* and
> without offloading.
>
> Tobias
>
> (* it seems as if there is still another issue with mapping, this time
> for array-descriptor variables that do not exist on the host, i.e. that
> have to be fully mapped to the host. I will look into this today or Monday.)
>
> On 02.02.23 15:13, Tobias Burnus wrote:
>> Found when testing AMD GCN offloading, the second issue came up with
>> libgomp.fortran/reverse-offload-5.f90. (But oddly not with nvptx.)
>>
>> While the first one (new test: libgomp.fortran/reverse-offload-6.f90)
>> came up when debugging the issue.
>>
>> Tobias


> commit 0b1ce70a813b98ef2893779d14ad6c90c5d06a71
> Author: Tobias Burnus 
> Date:   Fri Feb 3 11:31:53 2023 +0100
>
> libgomp: Fix reverse offload issues
>
> If there is nothing to map, skip the mapping and avoid attempting to
> copy 0 bytes from addrs, sizes and kinds.
>
> Additionally, it could happen that a non-allocated address was 
> deallocated,
> such as a pointer set, leading to a free for the actual data.
>
> libgomp/
> * target.c (gomp_target_rev): Handle mapnum == 0 and avoid
> freeing not allocated memory.
> * testsuite/libgomp.fortran/reverse-offload-6.f90: New test.
> ---
>  libgomp/target.c   |  8 +++---
>  .../libgomp.fortran/reverse-offload-6.f90  | 32 
> ++
>  2 files changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/libgomp/target.c b/libgomp/target.c
> index b16ee761a95..c1682caea13 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -3324,7 +3324,7 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, 
> uint64_t devaddrs_ptr,
>  gomp_fatal ("Cannot find reverse-offload function");
>void (*host_fn)() = (void (*)()) n->k->host_start;
>
> -  if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
> +  if ((devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) || mapnum == 0)
>  {
>devaddrs = (uint64_t *) (uintptr_t) devaddrs_ptr;
>sizes = (uint64_t *) (uintptr_t) sizes_ptr;
> @@ -3402,7 +3402,7 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, 
> uint64_t devaddrs_ptr,
> }
>  }
>
> -  if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM))
> +  if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) && mapnum > 0)
>  {
>size_t j, struct_cpy = 0;
>splay_tree_key n2;
> @@ -3638,7 +3638,7 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, 
> uint64_t devaddrs_ptr,
>
>host_fn (devaddrs);
>
> -  if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM))
> +  if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) && mapnum > 0)
>  {
>uint64_t struct_cpy = 0;
>bool clean_struct = false;
> @@ -3680,7 +3680,7 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, 
> uint64_t devaddrs_ptr,
> clean_struct = true;
> struct_cpy = sizes[i];
>   }
> -   else if (cdata[i].aligned)
> +   else if (!cdata[i].present && cdata[i].aligned)
>   gomp_aligned_free ((void *) (uintptr_t) devaddrs[i]);
> else if (!cdata[i].present)
>   free ((void *) (uintptr_t) devaddrs[i]);
> diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-6.f90 
> b/libgomp/testsuite/libgomp.fortran/reverse-offload-6.f90
> new file mode 100644
> index 000..04866edbba7
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-6.f90
> @@ -0,0 +1,32 @@
> +!
> +! Ensure that a mapping with no argument works
> +!
> +
> +module m
> +  implicit none (type, external)
> +  integer :: x = 32
> +  integer :: dev_num2 = -1
> +contains
> +subroutine  foo()
> +  use omp_lib, only: omp_get_device_num
> +  x = x + 10
> +  dev_num2 = omp_get_device_num()
> +end
> +end module m
> +
> +use m
> +use omp_lib
> +!$omp requires reverse_offload
> +implicit none (type, external)
> +integer :: dev_num = -1
> +!$omp target map(from:dev_num)
> +  dev_num = omp_get_device_num()
> +  ! This calls GOMP_target_ext with number of maps = 0
> +  !$omp target device(ancestor:1)
> +call foo
> +  !$omp end target
> +!$omp end target
> +
> +if (omp_get_num_devices() > 0 .and.  dev_num2 == dev_num) stop 1
> +if (x /= 

[pushed] analyzer: fix -Wanalyzer-use-of-uninitialized-value false +ve on "read" [PR108661]

2023-02-07 Thread David Malcolm via Gcc-patches
My integration testing shows many false positives from
-Wanalyzer-use-of-uninitialized-value.

One cause turns out to be that as of r13-1404-g97baacba963c06
fd_state_machine::on_stmt recognizes calls to "read", and returns true,
so that region_model::on_call_post doesn't call handle_unrecognized_call
on them, and so the analyzer erroneously "thinks" that the buffer
pointed to by "read" is never touched by the "read" call.

This works for "fread" because sm-file.cc implements kf_fread, which
handles calls to "fread" by clobbering the buffer pointed to.  In the
long term we should probably be smarter about this and bifurcate the
analysis to consider e.g. errors vs full reads vs partial reads, etc
(which I'm tracking in PR analyzer/108689).

In the meantime, this patch adds a kf_read for "read" analogous to the
one for "fread", fixing 6 false positives seen in git-2.39.0 and
2 in haproxy-2.7.1.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-5733-gc300e251f5b22d.

gcc/analyzer/ChangeLog:
PR analyzer/108661
* sm-fd.cc (class kf_read): New.
(register_known_fd_functions): Register "read".
* sm-file.cc (class kf_fread): Update comment.

gcc/testsuite/ChangeLog:
PR analyzer/108661
* gcc.dg/analyzer/fread-pr108661.c: New test.
* gcc.dg/analyzer/read-pr108661.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/sm-fd.cc | 33 +++
 gcc/analyzer/sm-file.cc   | 10 -
 .../gcc.dg/analyzer/fread-pr108661.c  | 40 +++
 gcc/testsuite/gcc.dg/analyzer/read-pr108661.c | 33 +++
 4 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/fread-pr108661.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/read-pr108661.c

diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
index 494d802a1d4..d107390bc00 100644
--- a/gcc/analyzer/sm-fd.cc
+++ b/gcc/analyzer/sm-fd.cc
@@ -2659,6 +2659,38 @@ private:
   unsigned m_num_args;
 };
 
+/* Handler for "read".
+ ssize_t read(int fildes, void *buf, size_t nbyte);
+   See e.g. https://man7.org/linux/man-pages/man2/read.2.html   */
+
+class kf_read : public known_function
+{
+public:
+  bool matches_call_types_p (const call_details ) const final override
+  {
+return (cd.num_args () == 3
+   && cd.arg_is_pointer_p (1)
+   && cd.arg_is_size_p (2));
+  }
+
+  /* For now, assume that any call to "read" fully clobbers the buffer
+ passed in.  This isn't quite correct (e.g. errors, partial reads;
+ see PR analyzer/108689), but at least stops us falsely complaining
+ about the buffer being uninitialized.  */
+  void impl_call_pre (const call_details ) const final override
+  {
+region_model *model = cd.get_model ();
+const svalue *ptr_sval = cd.get_arg_svalue (1);
+if (const region *reg = ptr_sval->maybe_get_region ())
+  {
+   const region *base_reg = reg->get_base_region ();
+   const svalue *new_sval = cd.get_or_create_conjured_svalue (base_reg);
+   model->set_value (base_reg, new_sval, cd.get_ctxt ());
+  }
+  }
+};
+
+
 /* Populate KFM with instances of known functions relating to
file descriptors.  */
 
@@ -2672,6 +2704,7 @@ register_known_fd_functions (known_function_manager )
   kfm.add ("listen", make_unique ());
   kfm.add ("pipe", make_unique (1));
   kfm.add ("pipe2", make_unique (2));
+  kfm.add ("read", make_unique ());
   kfm.add ("socket", make_unique ());
 }
 
diff --git a/gcc/analyzer/sm-file.cc b/gcc/analyzer/sm-file.cc
index 9cb4e32e286..d99a09b76c4 100644
--- a/gcc/analyzer/sm-file.cc
+++ b/gcc/analyzer/sm-file.cc
@@ -560,7 +560,11 @@ public:
   }
 };
 
-/* Handler for "fread"".  */
+/* Handler for "fread".
+ size_t fread(void *restrict buffer, size_t size, size_t count,
+ FILE *restrict stream);
+   See e.g. https://en.cppreference.com/w/c/io/fread
+   and https://www.man7.org/linux/man-pages/man3/fread.3.html */
 
 class kf_fread : public known_function
 {
@@ -574,6 +578,10 @@ public:
&& cd.arg_is_pointer_p (3));
   }
 
+  /* For now, assume that any call to "fread" fully clobbers the buffer
+ passed in.  This isn't quite correct (e.g. errors, partial reads;
+ see PR analyzer/108689), but at least stops us falsely complaining
+ about the buffer being uninitialized.  */
   void impl_call_pre (const call_details ) const final override
   {
 region_model *model = cd.get_model ();
diff --git a/gcc/testsuite/gcc.dg/analyzer/fread-pr108661.c 
b/gcc/testsuite/gcc.dg/analyzer/fread-pr108661.c
new file mode 100644
index 000..b51cf41ec2a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/fread-pr108661.c
@@ -0,0 +1,40 @@
+typedef __SIZE_TYPE__ size_t;
+
+extern size_t fread (void *, size_t, size_t, void *);
+
+struct ring
+{
+  char buf[1024];
+};
+
+int
+test_one_large_item (void *fp)
+{
+  struct ring ring;
+  int 

[COMMITTED, GCC 12] Fix PR 108582: ICE due to PHI-OPT removing a still in use ssa_name.

2023-02-07 Thread Andrew Pinski via Gcc-patches
This patch adds a check in match_simplify_replacement to make sure the middlebb
does not have any phi-nodes as we don't currently move those.
This was just a thinko from before.

Committed on the GCC 12 branch after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/108582

gcc/ChangeLog:

* tree-ssa-phiopt.cc (match_simplify_replacement): Add check
for middlebb to have no phi nodes.

gcc/testsuite/ChangeLog:

* gcc.dg/pr108582-1.c: New test.

(cherry picked from commit 876b3e0514bc8cb2256c44db56255403bedfa52d)
---
 gcc/testsuite/gcc.dg/pr108582-1.c | 58 +++
 gcc/tree-ssa-phiopt.cc|  5 +++
 2 files changed, 63 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr108582-1.c

diff --git a/gcc/testsuite/gcc.dg/pr108582-1.c 
b/gcc/testsuite/gcc.dg/pr108582-1.c
new file mode 100644
index 000..88c2de369ad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108582-1.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-tree-ccp -fno-tree-dce" } */
+
+/*
+  PHI-OPT via match_simplify_replacement used to transform:
+  if (_25 != 0)
+goto ; [25.00%]
+  else
+goto ; [75.00%]
+
+   [local count: 11649864]:
+  # iftmp.5_13 = PHI <2(7)>
+  k_22 = k_11 | iftmp.5_13;
+
+   [local count: 105655256]:
+  # g_9 = PHI <1(2), 0(8), g_8(7)>
+  # k_12 = PHI 
+
+into:
+
+  _15 = (int) _25;
+  _28 = -_15;
+  _4 = _13 & _28;
+  _6 = _4 | k_11;
+
+   [local count: 105655256]:
+  # g_9 = PHI <1(2), g_8(7)>
+  # k_12 = PHI 
+
+Removing the phi-node/assignment of _13.
+
+ */
+
+int a, c, d, e, f;
+char b;
+int main() {
+  int g = 1;
+  char h[1] = {0};
+  while (a) {
+if (f) {
+  b = 0;
+  if (d)
+continue;
+}
+if (a < 1) {
+  g = 0;
+  goto L;
+}
+  }
+  while (c) {
+char *j = h;
+int k;
+  L:
+if (e && !g)
+  k |= 2 | (*j < 0);
+  }
+  return 0;
+}
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 3eda825672c..b8e57bb470e 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -973,6 +973,11 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
   if (!single_pred_p (middle_bb))
return false;
 
+  /* The middle bb cannot have phi nodes as we don't
+move those assignments yet. */
+  if (!gimple_seq_empty_p (phi_nodes (middle_bb)))
+   return false;
+
   stmt_to_move = last_and_only_stmt (middle_bb);
   if (!stmt_to_move)
return false;
-- 
2.27.0



Re: [PATCH] tree-optimization/108522 Use component_ref_field_offset

2023-02-07 Thread Siddhesh Poyarekar




On 2023-01-25 22:32, Siddhesh Poyarekar wrote:

Instead of using TREE_OPERAND (expr, 2) directly, use
component_ref_field_offset instead, which does scaling for us.  The
function also substitutes PLACEHOLDER_EXPRs, which is probably what we
want anyway but I'm not sure if it's relevant for tree-object-size.

gcc/ChangeLog:

PR tree-optimization/108522
* tree-object-size.cc (compute_object_offset): Make EXPR
argument non-const.  Call component_ref_field_offset.

gcc/testsuite/ChangeLog:

PR tree-optimization/108522
* gcc.dg/builtin-dynamic-object-size-0.c (DEFSTRUCT): New
macro.
(test_dynarray_struct_member_b, test_dynarray_struct_member_c,
test_dynarray_struct_member_d,
test_dynarray_struct_member_subobj_b,
test_dynarray_struct_member_subobj_c,
test_dynarray_struct_member_subobj_d): New tests.
(main): Call them.

Signed-off-by: Siddhesh Poyarekar 


... and now pushed (this and the earlier commit) to gcc-12 branch.

Sid


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Qing Zhao via Gcc-patches


> On Feb 7, 2023, at 2:17 PM, Joseph Myers  wrote:
> 
> On Tue, 7 Feb 2023, Qing Zhao via Gcc-patches wrote:
> 
>> 1.  Structure with flexible array member embedded into other structures 
>> recursively, for example:
>> 
>> struct A {
>>  int n;
>>  char data[];
>> };
>> 
>> struct B {
>>  int m;
>>  struct A a;
>> };
>> 
>> struct C {
>>  int q;
>>  struct B b;
>> };
>> 
>> In the above, “struct C” will not be caught by this routine.
> 
> Because struct B is diagnosed with -pedantic when it embed struct A, there 
> is no need for -pedantic to diagnose struct C as well when it embeds 
> struct B.

Oh, yes.
Then, this routine (flexible_array_type_p) is mainly for diagnostic purpose.
It cannot be used to determine whether the structure/union type recursively
include a flexible array member at the end.

Is my understanding correct?

> 
>> 2. Only C99 standard flexible array member be included, [0] and [1] are 
>> not included, for example:
> 
> Obviously we can't diagnose use of structures with [1] trailing members, 
> because it's perfectly valid to embed those structures at any position 
> inside other structures.  And the same is the case for the [0] extension 
> when it's used to mean "empty array" rather than "flexible array".

With the -fstrict-flex-arrays available, we should be able to diagnose
the flexible array member per gnu extension (i.e [0] or [1]) the same as []. 

> 
> Note that my comments above are about what diagnostics are appropriate 
> under the standard.  They are *not* about how code generation might allow 
> for possible uses of certain source code constructs as if they were 
> flexible array members.  The two contexts may very well require different 
> notions of what counts as a flexible array member.

Yes. That’s right.

Thanks.

Qing
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com



[PATCH] Fortran: error handling of global entity appearing in COMMON block [PR103259]

2023-02-07 Thread Harald Anlauf via Gcc-patches
Dear all,

the attached trivial patch by Steve fixes a NULL pointer dereference
that occurs when an error shall be emitted for a global entity that
conflicts with a symbol appearing in a COMMON block, but the symbol's
location is not set.  This may happen e.g. in the testcase in the PR,
where the COMMON variables are imported from a module.

Regtested on x86_64-pc-linux-gnu with no failures.

I intend to commit as obvious within 24h unless there are comments.

Thanks,
Harald

From 1c225608e0cabbe909cc4d6e69fb0281bb99418f Mon Sep 17 00:00:00 2001
From: Steve Kargl 
Date: Tue, 7 Feb 2023 20:18:42 +0100
Subject: [PATCH] Fortran: error handling of global entity appearing in COMMON
 block [PR103259]

gcc/fortran/ChangeLog:

	PR fortran/103259
	* resolve.cc (resolve_common_vars): Avoid NULL pointer dereference
	when a symbol's location is not set.

gcc/testsuite/ChangeLog:

	PR fortran/103259
	* gfortran.dg/pr103259.f90: New test.
---
 gcc/fortran/resolve.cc | 12 +---
 gcc/testsuite/gfortran.dg/pr103259.f90 | 11 +++
 2 files changed, 20 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr103259.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 100b2382e22..549916c2b53 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -953,9 +953,15 @@ resolve_common_vars (gfc_common_head *common_block, bool named_common)
 {
   gsym = gfc_find_gsymbol (gfc_gsym_root, csym->name);
   if (gsym && (gsym->type == GSYM_MODULE || gsym->type == GSYM_PROGRAM))
-	gfc_error_now ("Global entity %qs at %L cannot appear in a "
-			"COMMON block at %L", gsym->name,
-			>where, >common_block->where);
+	{
+	  if (csym->common_block)
+	gfc_error_now ("Global entity %qs at %L cannot appear in a "
+			   "COMMON block at %L", gsym->name,
+			   >where, >common_block->where);
+	  else
+	gfc_error_now ("Global entity %qs at %L cannot appear in a "
+			   "COMMON block", gsym->name, >where);
+	}

   /* gfc_add_in_common may have been called before, but the reported errors
 	 have been ignored to continue parsing.
diff --git a/gcc/testsuite/gfortran.dg/pr103259.f90 b/gcc/testsuite/gfortran.dg/pr103259.f90
new file mode 100644
index 000..f78ff26180e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103259.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/103259 - ICE in resolve_common_vars
+! Contributed by G.Steinmetz
+
+module m
+  integer :: p
+  common /c/ p
+end
+program p ! { dg-error "cannot appear in a COMMON block" }
+  use m   ! { dg-error "is also the name of the current program unit" }
+end
--
2.35.3



Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Joseph Myers
On Tue, 7 Feb 2023, Qing Zhao via Gcc-patches wrote:

> 1.  Structure with flexible array member embedded into other structures 
> recursively, for example:
> 
> struct A {
>   int n;
>   char data[];
> };
> 
> struct B {
>   int m;
>   struct A a;
> };
> 
> struct C {
>   int q;
>   struct B b;
> };
> 
> In the above, “struct C” will not be caught by this routine.

Because struct B is diagnosed with -pedantic when it embed struct A, there 
is no need for -pedantic to diagnose struct C as well when it embeds 
struct B.

> 2. Only C99 standard flexible array member be included, [0] and [1] are 
> not included, for example:

Obviously we can't diagnose use of structures with [1] trailing members, 
because it's perfectly valid to embed those structures at any position 
inside other structures.  And the same is the case for the [0] extension 
when it's used to mean "empty array" rather than "flexible array".

Note that my comments above are about what diagnostics are appropriate 
under the standard.  They are *not* about how code generation might allow 
for possible uses of certain source code constructs as if they were 
flexible array members.  The two contexts may very well require different 
notions of what counts as a flexible array member.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] testsuite: Generalize check_effective_target_lra

2023-02-07 Thread Hans-Peter Nilsson via Gcc-patches
Tested native x86_64-pc-linux-gnu and cris-elf (non-LRA and
also hacked to switch to LRA).

Ok to commit?

--- 8< ---
The LRA target list is incomplete.  Rather than syncing it with actual
LRA targets, better use existing infrastructure and look for a
LRA-specific pattern in the reload dump (which has the same name, but
completely different contents).

* lib/target-supports.exp: Replace target list with looking for
a LRA-specific string in the reload dump.
---
 gcc/testsuite/lib/target-supports.exp | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 227e3004077a..e62b7a2c869d 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12192,10 +12192,11 @@ proc check_effective_target_o_flag_in_section { } {
 # return 1 if LRA is supported.
 
 proc check_effective_target_lra { } {
-if { [istarget hppa*-*-*] } {
-   return 0
-}
-return 1
+# Iterating over extended basic blocks is new with LRA.  Also need
+# a context to avoid spuriously matching a register name.
+return [check_no_messages_and_pattern lra "EBB 2 3" rtl-reload {
+   void foo (void) { }
+}]
 }
 
 # Test whether optimizations are enabled ('__OPTIMIZE__') per the
-- 
2.30.2



Re: [PATCH] testsuite: XFAIL bogus g++.dg/warn/Wstringop-overflow-4.C:144, PR106120

2023-02-07 Thread Hans-Peter Nilsson via Gcc-patches
> From: Aldy Hernandez 
> Date: Tue, 7 Feb 2023 17:52:02 +0100

> Up to the release managers, but I have no objections.

I take it that's for both patches.  Thanks!
(Potential reviewers: these patches are local to the testsuite.)

brgds, H-P


> 
> Aldy
> 
> On 2/7/23 17:50, Hans-Peter Nilsson wrote:
> > (sort-of-ping:) Aldy, I missed CC:ing you on the similar
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611206.html
> > would you mind having a look?
> > 
> > Tested native x86_64-pc-linux-gnu (w/wo. -m32) and cris-elf.
> > Ok to commit?
> > 
> >  8< 
> > There was a commit r13-2082-gbf13a13c65bd06 "c++: remove some xfails"
> > (not referencing the PR) that dealt with part of the PR, but didn't
> > xfail the ilp32-specific (bogus) warning mentioned in the PR.
> > 
> > PR testsuite/106120
> > * g++.dg/warn/Wstringop-overflow-4.C:144 XFAIL bogus warning for
> > ilp32 targets with c++98.
> > ---
> >   gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C 
> > b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
> > index 3716d2d13be0..35fb59e02320 100644
> > --- a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
> > +++ b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
> > @@ -141,7 +141,7 @@ void test_strcpy_new_int16_t (size_t n, const size_t 
> > vals[])
> >   
> > int r_imin_imax = SR (INT_MIN, INT_MAX);
> > T (S (1), new int16_t[r_imin_imax]);
> > -  T (S (2), new int16_t[r_imin_imax + 1]);
> > +  T (S (2), new int16_t[r_imin_imax + 1]); // { dg-bogus "into a region of 
> > size" "pr106120" { xfail { ilp32 && c++98_only } } }
> > T (S (9), new int16_t[r_imin_imax * 2 + 1]);
> >   
> > int r_0_imax = SR (0, INT_MAX);
> 


Re: [PATCH] testsuite: XFAIL bogus g++.dg/warn/Wstringop-overflow-4.C:144, PR106120

2023-02-07 Thread Aldy Hernandez via Gcc-patches

Up to the release managers, but I have no objections.

Aldy

On 2/7/23 17:50, Hans-Peter Nilsson wrote:

(sort-of-ping:) Aldy, I missed CC:ing you on the similar
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611206.html
would you mind having a look?

Tested native x86_64-pc-linux-gnu (w/wo. -m32) and cris-elf.
Ok to commit?

 8< 
There was a commit r13-2082-gbf13a13c65bd06 "c++: remove some xfails"
(not referencing the PR) that dealt with part of the PR, but didn't
xfail the ilp32-specific (bogus) warning mentioned in the PR.

PR testsuite/106120
* g++.dg/warn/Wstringop-overflow-4.C:144 XFAIL bogus warning for
ilp32 targets with c++98.
---
  gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C 
b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
index 3716d2d13be0..35fb59e02320 100644
--- a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
+++ b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
@@ -141,7 +141,7 @@ void test_strcpy_new_int16_t (size_t n, const size_t vals[])
  
int r_imin_imax = SR (INT_MIN, INT_MAX);

T (S (1), new int16_t[r_imin_imax]);
-  T (S (2), new int16_t[r_imin_imax + 1]);
+  T (S (2), new int16_t[r_imin_imax + 1]); // { dg-bogus "into a region of size" 
"pr106120" { xfail { ilp32 && c++98_only } } }
T (S (9), new int16_t[r_imin_imax * 2 + 1]);
  
int r_0_imax = SR (0, INT_MAX);




[PATCH] testsuite: XFAIL bogus g++.dg/warn/Wstringop-overflow-4.C:144, PR106120

2023-02-07 Thread Hans-Peter Nilsson via Gcc-patches
(sort-of-ping:) Aldy, I missed CC:ing you on the similar
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611206.html
would you mind having a look?

Tested native x86_64-pc-linux-gnu (w/wo. -m32) and cris-elf.
Ok to commit?

 8< 
There was a commit r13-2082-gbf13a13c65bd06 "c++: remove some xfails"
(not referencing the PR) that dealt with part of the PR, but didn't
xfail the ilp32-specific (bogus) warning mentioned in the PR.

PR testsuite/106120
* g++.dg/warn/Wstringop-overflow-4.C:144 XFAIL bogus warning for
ilp32 targets with c++98.
---
 gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C 
b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
index 3716d2d13be0..35fb59e02320 100644
--- a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
+++ b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
@@ -141,7 +141,7 @@ void test_strcpy_new_int16_t (size_t n, const size_t vals[])
 
   int r_imin_imax = SR (INT_MIN, INT_MAX);
   T (S (1), new int16_t[r_imin_imax]);
-  T (S (2), new int16_t[r_imin_imax + 1]);
+  T (S (2), new int16_t[r_imin_imax + 1]); // { dg-bogus "into a region of 
size" "pr106120" { xfail { ilp32 && c++98_only } } }
   T (S (9), new int16_t[r_imin_imax * 2 + 1]);
 
   int r_0_imax = SR (0, INT_MAX);
-- 
2.30.2



[pushed] doc: Update -fchar8_t documentation

2023-02-07 Thread Marek Polacek via Gcc-patches
Since C++20 P2513R4, char8_t Compatibility and Portability Fix it is
no longer true that

  char ca[] = u8"xx";

causes an error so adjust the example for -fchar8_t.

Applied.

gcc/ChangeLog:

* doc/invoke.texi: Update -fchar8_t documentation.
---
 gcc/doc/invoke.texi | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1eda0e0396b..51447a78584 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3057,8 +3057,6 @@ following code is well-formed under ISO C++11, but is 
ill-formed when
 @option{-fchar8_t} is specified.
 
 @smallexample
-char ca[] = u8"xx"; // error: char-array initialized from wide
-//string
 const char *cp = u8"xx";// error: invalid conversion from
 //`const char8_t*' to `const char*'
 int f(const char*);

base-commit: f661c0bb6371f355966a67b5ce71398e80792948
-- 
2.39.1



[PATCH v4] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-02-07 Thread Marek Polacek via Gcc-patches
On Sun, Feb 05, 2023 at 05:25:25PM -0800, Jason Merrill wrote:
> On 1/24/23 17:49, Marek Polacek wrote:
> > On Fri, Jan 20, 2023 at 03:19:54PM -0500, Jason Merrill wrote:
> > > On 1/19/23 21:03, Marek Polacek wrote:
> > > > On Thu, Jan 19, 2023 at 01:02:02PM -0500, Jason Merrill wrote:
> > > > > On 1/18/23 20:13, Marek Polacek wrote:
> > > > > > On Wed, Jan 18, 2023 at 04:07:59PM -0500, Jason Merrill wrote:
> > > > > > > On 1/18/23 12:52, Marek Polacek wrote:
> > > > > > > > Here, -Wdangling-reference triggers where it probably 
> > > > > > > > shouldn't, causing
> > > > > > > > some grief.  The code in question uses a reference wrapper with 
> > > > > > > > a member
> > > > > > > > function returning a reference to a subobject of a 
> > > > > > > > non-temporary object:
> > > > > > > > 
> > > > > > > >   const Plane & meta = fm.planes().inner();
> > > > > > > > 
> > > > > > > > I've tried a few approaches, e.g., checking that the member 
> > > > > > > > function's
> > > > > > > > return type is the same as the type of the enclosing class 
> > > > > > > > (which is
> > > > > > > > the case for member functions returning *this), but that then 
> > > > > > > > breaks
> > > > > > > > Wdangling-reference4.C with std::optional.
> > > > > > > > 
> > > > > > > > So I figured that perhaps we want to look at the object we're 
> > > > > > > > invoking
> > > > > > > > the member function(s) on and see if that is a temporary, as 
> > > > > > > > in, don't
> > > > > > > > warn about
> > > > > > > > 
> > > > > > > >   const Plane & meta = fm.planes().inner();
> > > > > > > > 
> > > > > > > > but do warn about
> > > > > > > > 
> > > > > > > >   const Plane & meta = FrameMetadata().planes().inner();
> > > > > > > > 
> > > > > > > > It's ugly, but better than asking users to add #pragmas into 
> > > > > > > > their code.
> > > > > > > 
> > > > > > > Hmm, that doesn't seem right; the former is only OK because Ref 
> > > > > > > is in fact a
> > > > > > > reference-like type.  If planes() returned a class that held 
> > > > > > > data, we would
> > > > > > > want to warn.
> > > > > > 
> > > > > > Sure, it's always some kind of tradeoff with warnings :/.
> > > > > > > In this case, we might recognize the reference-like class because 
> > > > > > > it has a
> > > > > > > reference member and a constructor taking the same reference type.
> > > > > > 
> > > > > > That occurred to me too, but then I found out that 
> > > > > > std::reference_wrapper
> > > > > > actually uses T*, not T&, as you say.  But here's a patch to do that
> > > > > > (I hope).
> > > > > > > That wouldn't help with std::reference_wrapper or std::ref_view 
> > > > > > > because they
> > > > > > > have pointer members instead of references, but perhaps loosening 
> > > > > > > the check
> > > > > > > to include that case would make sense?
> > > > > > 
> > > > > > Sorry, I don't understand what you mean by loosening the check.  I 
> > > > > > could
> > > > > > hardcode std::reference_wrapper and std::ref_view but I don't think 
> > > > > > that's
> > > > > > what you meant.
> > > > > 
> > > > > Indeed that's not what I meant, but as I was saying in our meeting I 
> > > > > think
> > > > > it's worth doing; the compiler has various tweaks to handle specific
> > > > > standard-library classes better.
> > > > Okay, done in the patch below.  Except that I'm not including a test for
> > > > std::ranges::ref_view because I don't really know how that works.
> > > > 
> > > > > > Surely I cannot _not_ warn for any class that contains a T*.
> > > > > 
> > > > > I was thinking if a constructor takes a T& and the class has a T* 
> > > > > that would
> > > > > be close enough, though this also wouldn't handle the standard library
> > > > > classes so the benefit is questionable.
> > > > > 
> > > > > > Here's the patch so that we have some actual code to discuss...  
> > > > > > Thanks.
> > > > > > 
> > > > > > -- >8 --
> > > > > > Here, -Wdangling-reference triggers where it probably shouldn't, 
> > > > > > causing
> > > > > > some grief.  The code in question uses a reference wrapper with a 
> > > > > > member
> > > > > > function returning a reference to a subobject of a non-temporary 
> > > > > > object:
> > > > > > 
> > > > > >  const Plane & meta = fm.planes().inner();
> > > > > > 
> > > > > > I've tried a few approaches, e.g., checking that the member 
> > > > > > function's
> > > > > > return type is the same as the type of the enclosing class (which is
> > > > > > the case for member functions returning *this), but that then breaks
> > > > > > Wdangling-reference4.C with std::optional.
> > > > > > 
> > > > > > Perhaps we want to look at the member function's enclosing class
> > > > > > to see if it's a reference wrapper class (meaning, has a reference
> > > > > > member and a constructor taking the same reference type) and don't
> > > > > > warn if so, supposing that the member function returns a reference
> > > > > > to a non-temporary object.
> > 

Re: [committed] testsuite: Expect -Wdeprecated warning in warn/Wstrict-aliasing-bogus-union-2.C for C++23

2023-02-07 Thread Jonathan Wakely via Gcc-patches
On Tue, 7 Feb 2023 at 09:25, Jakub Jelinek  wrote:
>
> Hi!
>
> On Mon, Feb 06, 2023 at 02:26:01PM +, Jonathan Wakely via Gcc-patches 
> wrote:
> > With the recent change to deprecate std::aligned_storage and
> > std::aligned_union we need to adjust some tests that now fail with
> > -std=c++23.
>
> The g++.dg/warn/Wstrict-aliasing-bogus-union-2.C test is also affected:
> PASS: g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  -std=gnu++2b  (test for 
> bogus messages, line 12)
> FAIL: g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  -std=gnu++2b (test for 
> excess errors)
> Excess errors:
> .../gcc/testsuite/g++.dg/warn/Wstrict-aliasing-bogus-union-2.C:8:8: warning: 
> 'template struct 
> std::aligned_storage' is deprecated [-Wdeprecated-declarations]
>
> The following patch adds dg-warning for it.
>
> Tested on x86_64-linux -m32/-m64 with GXX_TESTSUITE_STDS=98,11,14,17,20,2b ,
> committed to trunk as obvious.

Oops, sorry about that - I must remember to check the g++ tests for
any header and deprecation changes.



OpenMP Patch Ping – including "[13 Regression]" patches

2023-02-07 Thread Tobias Burnus

Updated ping email as some patches have been reviewed. [Thanks! I still
need to revised three of them: for Fortran, the parallel + loop issue
and the non-rect-loop issue - and for C/C++, the allocate's align patch.]

I think it would be very good to get (some of) Julian's mapping patches
in as it fixes several issues; PR108624 is the latest found issue that
is fixed by his set of patches.

Smaller, more recent patches:

* "[Patch] libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET"
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611429.html

* "[PATCH] openmp: Add support for 'present' modifier"
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611299.html

On 10.01.23 12:37, Tobias Burnus wrote [with FYI comments omitted]:


Fortran allocat(e,ors) prep patch
=

* [Patch] Fortran/OpenMP: Add parsing support for allocators/allocate directive
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608904.html>  Wed Dec 
21 15:51:25 GMT 2022

Mapping related patches
===
(Complex but GCC needs a revision badly as it fixing several bugs and
missing functionality.)

* Complete patch set was just re-submitted by Julian, overiew patch is
   [PATCH v6 00/11] OpenMP: C/C++ lvalue parsing, C/C++/Fortran "declare 
mapper" support
   https://gcc.gnu.org/pipermail/gcc-patches/2022-December/thread.html#609031>  
Fri Dec 23 12:12:53 GMT 2022

* Note: For 10/11 of the set, there was a follow up this Monday
   [PATCH v6 10/11] OpenMP: Support OpenMP 5.0 "declare mapper" directives for C
   https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609566.html>

Fortran deep mapping (allocatable components)


[Patch][1/2] OpenMP: Add lang hooks + run-time filled map arrays for
Fortran deep mapping of DT
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609637.html>


Thanks,

Tobias

PS: NOTE to the list above: I have stopped checking older patches. I know
some more are pending review, others need to be revised. I will re-check,
once the below listed patches have been reviewed. Cf. old list.

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Qing Zhao via Gcc-patches



> On Feb 7, 2023, at 10:28 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-02-06 18:14, Joseph Myers wrote:
>> On Mon, 6 Feb 2023, Qing Zhao via Gcc-patches wrote:
>>> In GCC14:
>>> 
>>> 1. Include this new warning -Wgnu-varaible-sized-type-not-at-end to -Wall
>>> 2. Deprecate this extension from GCC. (Or delay this to next release?).
>> Any deprecation, or inclusion in -Wall, would best come with evidence
>> about the prevalance of use (possibly unintentional, probably undesirable)
>> of these extensions.  For example, maybe someone could do a distribution
>> rebuild with a patch to enable these warnings and report the results?
> 
> FWIW, Fred from our team has been working on a mass prebuilder that can be 
> used for this kind of distribution-wide validation.  I've used it for 
> _FORTIFY_SOURCE validation as well as coverage analysis.
> 
> I can help you with this Qing, once you have a patch ready.

Thanks a lot!

Qing
> 
> Sid
> 
> [1] https://gitlab.com/fedora/packager-tools/mass-prebuild/



Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Siddhesh Poyarekar

On 2023-02-06 18:14, Joseph Myers wrote:

On Mon, 6 Feb 2023, Qing Zhao via Gcc-patches wrote:


In GCC14:

1. Include this new warning -Wgnu-varaible-sized-type-not-at-end to -Wall
2. Deprecate this extension from GCC. (Or delay this to next release?).


Any deprecation, or inclusion in -Wall, would best come with evidence
about the prevalance of use (possibly unintentional, probably undesirable)
of these extensions.  For example, maybe someone could do a distribution
rebuild with a patch to enable these warnings and report the results?


FWIW, Fred from our team has been working on a mass prebuilder that can 
be used for this kind of distribution-wide validation.  I've used it for 
_FORTIFY_SOURCE validation as well as coverage analysis.


I can help you with this Qing, once you have a patch ready.

Sid

[1] https://gitlab.com/fedora/packager-tools/mass-prebuild/


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Qing Zhao via Gcc-patches
Hi, Joseph,


> On Feb 6, 2023, at 6:14 PM, Joseph Myers  wrote:
> 
> On Mon, 6 Feb 2023, Qing Zhao via Gcc-patches wrote:
> 
>> In GCC14:
>> 
>> 1. Include this new warning -Wgnu-varaible-sized-type-not-at-end to -Wall
>> 2. Deprecate this extension from GCC. (Or delay this to next release?).
> 
> Any deprecation, or inclusion in -Wall, would best come with evidence 
> about the prevalance of use (possibly unintentional, probably undesirable) 
> of these extensions.  For example, maybe someone could do a distribution 
> rebuild with a patch to enable these warnings and report the results?
Yes, before we deprecate this extension, it’s better to make sure all such
 misuses are updated already.
> 
> Various misuses of flexible array members are only pedwarns-if-pedantic 
> because of such uses - and while the original motivating case 
> 

Just checked this patch (which has been in GCC source tree already), the 
routine flexible_array_type_p 
+/* Determine whether TYPE is a structure with a flexible array member,
+   or a union containing such a structure (possibly recursively).  */
+
+static bool
+flexible_array_type_p (type)

Did not include the following cases:

1.  Structure with flexible array member embedded into other structures 
recursively, for example:

struct A {
  int n;
  char data[];
};

struct B {
  int m;
  struct A a;
};

struct C {
  int q;
  struct B b;
};

In the above, “struct C” will not be caught by this routine.

Shall “struct C” be included?

2. Only C99 standard flexible array member be included, [0] and [1] are not 
included, for example:

struct A {
  int n;
  char data[0];
};

struct B {
  int m;
  struct A a;
};

struct C {
  int q;
  struct B b;
};

In the above, “struct B” and “struct C” will not be caught by this routine.

Shall “the above struct B” and “struct C” be included per -fstrict-flex-arrays?

Could you please take a look at my latest patch:

https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611445.html

To see whether the new bit “TYPE_INCLUDE_FLEXARRAY” covers the above 
“flexible_array_type_p”? Then we can merge them together?


> was 
> _G_config.h, which has since been fixed (though existing installed headers 
> from old glibc would need fixincluding, at least if it becomes an error), 
> it's very plausible there are uses of these extensions elsewhere.
If this is the case, we should definitely deprecate this extension.

Thanks.

Qing
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com



Ping^4: [PATCH] jit: Install jit headers in $(libsubincludedir) [PR 101491]

2023-02-07 Thread Lorenzo Salvadore via Gcc-patches
Hello,

Ping https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606450.html

Thanks,

Lorenzo Salvadore

> From f8e2c2ee89a7d8741bb65163d1f1c20edcd546ac Mon Sep 17 00:00:00 2001
> From: Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> Date: Wed, 16 Nov 2022 11:27:38 +0100
> Subject: [PATCH] jit: Install jit headers in $(libsubincludedir) [PR 101491]
> 
> Installing jit/libgccjit.h and jit/libgccjit++.h headers in
> $(includedir) can be a problem for machines where multiple versions of
> GCC are required simultaneously, see for example this bug report on
> FreeBSD:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257060
> 
> Hence,
> 
> - define $(libsubincludedir) the same way it is defined in libgomp;
> - install jit/libgccjit.h and jit/libgccjit++.h in $(libsubincludedir).
> 
> The patch has already been applied successfully in the official FreeBSD
> ports tree for the ports lang/gcc11 and lang/gcc12. Please see the
> following commits:
> 
> https://cgit.freebsd.org/ports/commit/?id=0338e04504ee269b7a95e6707f1314bc1c4239fe
> https://cgit.freebsd.org/ports/commit/?id=f1957296ed2dce8a09bb9582e9a5a715bf8b3d4d
> 
> gcc/ChangeLog:
> 
> 2022-11-16 Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> PR jit/101491
> * Makefile.in: Define and create $(libsubincludedir)
> 
> gcc/jit/ChangeLog:
> 
> 2022-11-16 Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> PR jit/101491
> * Make-lang.in: Install headers in $(libsubincludedir)
> ---
> gcc/Makefile.in | 3 +++
> gcc/jit/Make-lang.in | 4 ++--
> 2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f672e6ea549..3bcf1c491ab 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -635,6 +635,8 @@ libexecdir = @libexecdir@
> 
> # Directory in which the compiler finds libraries etc.
> libsubdir = 
> $(libdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
> +# Directory in which the compiler finds headers.
> +libsubincludedir = $(libdir)/gcc/$(target_alias)/$(version)/include
> # Directory in which the compiler finds executables
> libexecsubdir = 
> $(libexecdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
> # Directory in which all plugin resources are installed
> @@ -3642,6 +3644,7 @@ install-cpp: installdirs cpp$(exeext)
> # $(libdir)/gcc/include isn't currently searched by cpp.
> installdirs:
> $(mkinstalldirs) $(DESTDIR)$(libsubdir)
> + $(mkinstalldirs) $(DESTDIR)$(libsubincludedir)
> $(mkinstalldirs) $(DESTDIR)$(libexecsubdir)
> $(mkinstalldirs) $(DESTDIR)$(bindir)
> $(mkinstalldirs) $(DESTDIR)$(includedir)
> diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
> index 248ec45b729..ba1b3e95da5 100644
> --- a/gcc/jit/Make-lang.in
> +++ b/gcc/jit/Make-lang.in
> @@ -360,9 +360,9 @@ selftest-jit:
> # Install hooks:
> jit.install-headers: installdirs
> $(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \
> - $(DESTDIR)$(includedir)/libgccjit.h
> + $(DESTDIR)$(libsubincludedir)/libgccjit.h
> $(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \
> - $(DESTDIR)$(includedir)/libgccjit++.h
> + $(DESTDIR)$(libsubincludedir)/libgccjit++.h
> 
> ifneq (,$(findstring mingw,$(target)))
> jit.install-common: installdirs jit.install-headers
> --
> 2.38.0


Re: [PATCH 0/2] Repost of patches for solving the build on Fedora 36 problem

2023-02-07 Thread Jakub Jelinek via Gcc-patches
On Tue, Feb 07, 2023 at 03:22:41PM +0100, Richard Biener via Gcc-patches wrote:
> > He did not specify --with=cpu= or --with-tune=, which means he got
> > power8 defaults for both of those.  It's hard for me to believe that
> > --with-tune=power9 could hide the issue, but we'll try that configuration
> > too.  Do you have any other configure options that might affect things?
> 
> The full configure is
> 
> ../configure --prefix=/usr --infodir=/usr/share/info
> --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
> --enable-languages=c,c++,objc,fortran,obj-c++,ada,go,jit,m2
> --enable-host-shared --enable-checking=release --disable-werror
> --with-gxx-include-dir=/usr/include/c++/13
> --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --enable-ssp
> --disable-libssp --disable-libvtv --enable-cet=auto --disable-libcc1
> --enable-plugin --with-bugurl=https://bugs.opensuse.org/
> '--with-pkgversion=SUSE Linux' --with-slibdir=/lib64
> --with-system-zlib --enable-libstdcxx-allocator=new
> --disable-libstdcxx-pch --enable-version-specific-runtime-libs
> --with-gcc-major-version-only --enable-linker-build-id
> --enable-linux-futex --enable-gnu-indirect-function
> --program-suffix=-13 --without-system-libunwind --with-cpu=power8
> --with-tune=power9 --with-long-double-format=ieee --enable-secureplt
> --with-long-double-128 --enable-targets=powerpcle-linux
> --disable-multilib --with-build-config=bootstrap-lto-lean
> --enable-link-mutex --build=powerpc64le-suse-linux
> --host=powerpc64le-suse-linux
> 
> and _mulkc3.c is built like

We do not get the ICE in Fedora builds either.  The important thing is
--enable-checking=release
With release checking there is no ICE, with checking there is one.

Jakub



Re: [PATCH 0/2] Repost of patches for solving the build on Fedora 36 problem

2023-02-07 Thread Richard Biener via Gcc-patches
On Mon, Feb 6, 2023 at 7:28 PM Peter Bergner  wrote:
>
> On 2/3/23 1:42 AM, Richard Biener wrote:
> > On Fri, Feb 3, 2023 at 6:44 AM Michael Meissner via Gcc-patches
> >  wrote:
> >>
> >> I'm reposting these two patches that allow GCC to build on Fedora 36 just 
> >> to be
> >> clear which patches I'm talking about.  The issue is that if GCC is 
> >> configured
> >> with long double using the IEEE 128-bit representation, it currently cannot
> >> build _mulkc3 and _divkc3 in libgcc.
> >
> > It's interesting that we do not see this with openSUSE where I configure 
> > with
> >
> > --with-cpu=power8 --with-tune=power9 --with-long-double-format=ieee
> > --with-long-double-128
> >
> > note this is ppc64le, we leave ppc64 and ppc with their default.
>
> That's strange, Bill just retested on our ppc64le openSUSE Tumbleweed system
> using basically the same configure options and sees the ICE:
>
> /home/seurer/gcc/git/gcc-trunk/libgcc/config/rs6000/_mulkc3.c: In function 
> '__mulkc3_sw':
> /home/seurer/gcc/git/gcc-trunk/libgcc/config/rs6000/_mulkc3.c:97:1: internal 
> compiler error: in fold_stmt, at gimple-range-fold.cc:522
>
> He did not specify --with=cpu= or --with-tune=, which means he got
> power8 defaults for both of those.  It's hard for me to believe that
> --with-tune=power9 could hide the issue, but we'll try that configuration
> too.  Do you have any other configure options that might affect things?

The full configure is

../configure --prefix=/usr --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
--enable-languages=c,c++,objc,fortran,obj-c++,ada,go,jit,m2
--enable-host-shared --enable-checking=release --disable-werror
--with-gxx-include-dir=/usr/include/c++/13
--with-libstdcxx-zoneinfo=/usr/share/zoneinfo --enable-ssp
--disable-libssp --disable-libvtv --enable-cet=auto --disable-libcc1
--enable-plugin --with-bugurl=https://bugs.opensuse.org/
'--with-pkgversion=SUSE Linux' --with-slibdir=/lib64
--with-system-zlib --enable-libstdcxx-allocator=new
--disable-libstdcxx-pch --enable-version-specific-runtime-libs
--with-gcc-major-version-only --enable-linker-build-id
--enable-linux-futex --enable-gnu-indirect-function
--program-suffix=-13 --without-system-libunwind --with-cpu=power8
--with-tune=power9 --with-long-double-format=ieee --enable-secureplt
--with-long-double-128 --enable-targets=powerpcle-linux
--disable-multilib --with-build-config=bootstrap-lto-lean
--enable-link-mutex --build=powerpc64le-suse-linux
--host=powerpc64le-suse-linux

and _mulkc3.c is built like

/home/abuild/rpmbuild/BUILD/gcc-13.0.1+git5679/obj-powerpc64le-suse-linux/./gcc/xgcc
-B/home/abuild/rpmbuild/BUILD/gcc-13.0.1+git5679/obj-powerpc64le-suse-linux/./gcc/
-B/usr/powerpc64le-suse-linux/bin/ -B/usr/powerpc64le-suse-linux/lib/
-isystem /usr/powerpc64le-suse-linux/include -isystem
/usr/powerpc64le-suse-linux/sys-include   -fno-checking -O2
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection
-Werror=return-type -g -U_FORTIFY_SOURCE -O2  -O2 -U_FORTIFY_SOURCE
-D_FORTIFY_SOURCE=3 -funwind-tables -fasynchronous-unwind-tables
-fstack-clash-protection -Werror=return-type -g -U_FORTIFY_SOURCE
-DIN_GCC -fPIC   -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wno-format -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition  -isystem ./include  -fPIC -mlong-double-128
-mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector  -fPIC -mlong-double-128 -mno-minimal-toc -I. -I.
-I../.././gcc -I../../../libgcc -I../../../libgcc/.
-I../../../libgcc/../gcc -I../../../libgcc/../include
-I../../../libgcc/../libdecnumber/dpd
-I../../../libgcc/../libdecnumber -DHAVE_CC_TLS  -Wno-type-limits
-mvsx -mfloat128 -mno-float128-hardware -mno-gnu-attribute
-I../../../libgcc/soft-fp -I../../../libgcc/config/rs6000
-DFLOAT128_HW_INSNS -DFLOAT128_HW_INSNS_ISA3_1  -o _mulkc3.o -MT
_mulkc3.o -MD -MP -MF _mulkc3.dep  -c
../../../libgcc/config/rs6000/_mulkc3.c -fvisibility=hidden
-DHIDE_EXPORTS

>
> Peter
>
>


Re: [PATCH] Print padding size when aligning struct member

2023-02-07 Thread Vit Kabele
On Tue, Feb 07, 2023 at 02:08:35PM +0100, Andreas Schwab wrote:
> On Feb 07 2023, Vít Kabele wrote:
> 
> > diff --git a/gcc/testsuite/gcc.dg/Wpadded.c b/gcc/testsuite/gcc.dg/Wpadded.c
> > index 70fcd79a6d4..357e7f61e4a 100644
> > --- a/gcc/testsuite/gcc.dg/Wpadded.c
> > +++ b/gcc/testsuite/gcc.dg/Wpadded.c
> > @@ -10,5 +10,5 @@
> >  
> >  struct foo {
> >char bar;
> > -  long baz; /* { dg-warning "padding struct to align" ""  { target { ! 
> > default_packed } } } */
> > +  long baz; /* { dg-warning "padding struct with 7 bytes to align" ""  { 
> > target { ! default_packed } } } */
> 
> The actual amount of padding is target dependent.
Ah sorry, would removing everything after `with...` be ok?

Like this:
> +  long baz; /* { dg-warning "padding struct with" ""  { target { ! 
> default_packed } } } */

-- 
Best regards,
Vit Kabele


[pushed] [PR103541] RA: Implement reuse of equivalent memory for caller saves optimization

2023-02-07 Thread Vladimir Makarov via Gcc-patches

The following patch solves

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103541

The patch was successfully bootstrapped and tested on x86-64, aarch64, 
and ppc64le.
commit f661c0bb6371f355966a67b5ce71398e80792948
Author: Vladimir N. Makarov 
Date:   Tue Feb 7 08:27:36 2023 -0500

RA: Implement reuse of equivalent memory for caller saves optimization

The test case shows opportunity to reuse memory with constant address for
caller saves optimization for constant or pure function call.  The patch
implements the memory reuse.

PR rtl-optimization/103541

gcc/ChangeLog:

* ira.h (struct ira_reg_equiv_s): Add new field caller_save_p.
* ira.cc (validate_equiv_mem): Check memref address variance.
(update_equiv_regs): Define caller save equivalence for
valid_combine.
(setup_reg_equiv): Clear defined_p flag for caller save equivalence.
* lra-constraints.cc (lra_copy_reg_equiv): Add new arg
call_save_p.  Use caller save equivalence depending on the arg.
(split_reg): Adjust the call.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103541.c: New.

diff --git a/gcc/ira.cc b/gcc/ira.cc
index 66df03e8a59..c6ee46286bc 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -3070,6 +3070,8 @@ validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED,
 info->equiv_mem_modified = true;
 }
 
+static int equiv_init_varies_p (rtx x);
+
 enum valid_equiv { valid_none, valid_combine, valid_reload };
 
 /* Verify that no store between START and the death of REG invalidates
@@ -3113,7 +3115,8 @@ validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref)
 	 been changed and all hell breaks loose.  */
 	  ret = valid_combine;
 	  if (!MEM_READONLY_P (memref)
-	  && !RTL_CONST_OR_PURE_CALL_P (insn))
+	  && (!RTL_CONST_OR_PURE_CALL_P (insn)
+		  || equiv_init_varies_p (XEXP (memref, 0
 	return valid_none;
 	}
 
@@ -3766,7 +3769,18 @@ update_equiv_regs (void)
 		{
 		  replacement = copy_rtx (SET_SRC (set));
 		  if (validity == valid_reload)
-		note = set_unique_reg_note (insn, REG_EQUIV, replacement);
+		{
+		  note = set_unique_reg_note (insn, REG_EQUIV, replacement);
+		}
+		  else
+		{
+		  /* We still can use this equivalence for caller save
+			 optimization in LRA.  Mark this.  */
+		  ira_reg_equiv[regno].caller_save_p = true;
+		  ira_reg_equiv[regno].init_insns
+			= gen_rtx_INSN_LIST (VOIDmode, insn,
+	 ira_reg_equiv[regno].init_insns);
+		}
 		}
 	}
 
@@ -4156,7 +4170,7 @@ setup_reg_equiv (void)
 		   legitimate, we ignore such REG_EQUIV notes.  */
 		if (memory_operand (x, VOIDmode))
 		  {
-		ira_reg_equiv[i].defined_p = true;
+		ira_reg_equiv[i].defined_p = !ira_reg_equiv[i].caller_save_p;
 		ira_reg_equiv[i].memory = x;
 		continue;
 		  }
diff --git a/gcc/ira.h b/gcc/ira.h
index 58b50dbe8a2..3d35025a46e 100644
--- a/gcc/ira.h
+++ b/gcc/ira.h
@@ -175,8 +175,11 @@ extern struct target_ira *this_target_ira;
 /* Major structure describing equivalence info for a pseudo.  */
 struct ira_reg_equiv_s
 {
-  /* True if we can use this equivalence.  */
+  /* True if we can use this as a general equivalence.  */
   bool defined_p;
+  /* True if we can use this equivalence only for caller save/restore
+ location.  */
+  bool caller_save_p;
   /* True if the usage of the equivalence is profitable.  */
   bool profitable_p;
   /* Equiv. memory, constant, invariant, and initializing insns of
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 7bffbc07ee2..dd4f68bbfc0 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -5771,14 +5771,17 @@ choose_split_class (enum reg_class allocno_class,
   return best_cl;
 }
 
-/* Copy any equivalence information from ORIGINAL_REGNO to NEW_REGNO.
-   It only makes sense to call this function if NEW_REGNO is always
-   equal to ORIGINAL_REGNO.  */
+/* Copy any equivalence information from ORIGINAL_REGNO to NEW_REGNO.  It only
+   makes sense to call this function if NEW_REGNO is always equal to
+   ORIGINAL_REGNO.  Set up defined_p flag when caller_save_p flag is set up and
+   CALL_SAVE_P is true.  */
 
 static void
-lra_copy_reg_equiv (unsigned int new_regno, unsigned int original_regno)
+lra_copy_reg_equiv (unsigned int new_regno, unsigned int original_regno,
+		bool call_save_p)
 {
-  if (!ira_reg_equiv[original_regno].defined_p)
+  if (!ira_reg_equiv[original_regno].defined_p
+  && !(call_save_p && ira_reg_equiv[original_regno].caller_save_p))
 return;
 
   ira_expand_reg_equiv ();
@@ -5958,7 +5961,7 @@ split_reg (bool before_p, int original_regno, rtx_insn *insn,
  rematerializing the original value instead of spilling to the stack.  */
   if (!HARD_REGISTER_NUM_P (original_regno)
   && mode == PSEUDO_REGNO_MODE (original_regno))
-lra_copy_reg_equiv (new_regno, original_regno);
+

[PATCH] tree-optimization/26854 - slow bitmap operations

2023-02-07 Thread Richard Biener via Gcc-patches
With the compiler.i testcase from the PR one can see bitmap_set_bit
very high in the profile, originating from SSA update and alias
stmt walking.  For SSA update mark_block_for_update essentially
performs redundant bitmap_set_bits and is called via
insert_updated_phi_nodes_for as

  EXECUTE_IF_SET_IN_BITMAP (pruned_idf, 0, i, bi)
...
  mark_block_for_update (bb);
  FOR_EACH_EDGE (e, ei, bb->preds)
if (e->src->index >= 0)
  mark_block_for_update (e->src);

which is quite random in the access pattern and runs into the
O(n) case of the linked list bitmap representation.  Switching
blocks_to_update to tree view around insert_updated_phi_nodes_for
improves SSA update time from

 tree SSA incremental   :   4.26 (  3%)

to

 tree SSA incremental   :   2.98 (  2%)

Likewise the visited bitmap allocated by the alias walker benefits
from using the tree view in case of large CFGs and we see an
improvement from

 alias stmt walking :  10.53 (  9%)

to

 alias stmt walking :   4.05 (  4%)


Bootstrap and regtest running on x86_64-unknown-linux-gnu.

PR tree-optimization/26854
* tree-into-ssa.cc (update_ssa): Turn blocks_to_update to tree
view around insert_updated_phi_nodes_for.
* tree-ssa-alias.cc (maybe_skip_until): Allocate visited bitmap
in tree view.
(walk_aliased_vdefs_1): Likewise.
---
 gcc/tree-into-ssa.cc  |  4 
 gcc/tree-ssa-alias.cc | 10 --
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-into-ssa.cc b/gcc/tree-into-ssa.cc
index 067d29698f9..2e322990456 100644
--- a/gcc/tree-into-ssa.cc
+++ b/gcc/tree-into-ssa.cc
@@ -3561,6 +3561,8 @@ update_ssa (unsigned update_flags)
bitmap_initialize ([bb->index], _default_obstack);
   compute_dominance_frontiers (dfs);
 
+  bitmap_tree_view (blocks_to_update);
+
   /* insert_update_phi_nodes_for will call add_new_name_mapping
 when inserting new PHI nodes, but it will not add any
 new members to OLD_SSA_NAMES.  */
@@ -3574,6 +3576,8 @@ update_ssa (unsigned update_flags)
   FOR_EACH_VEC_ELT (symbols_to_rename, i, sym)
insert_updated_phi_nodes_for (sym, dfs, update_flags);
 
+  bitmap_list_view (blocks_to_update);
+
   FOR_EACH_BB_FN (bb, cfun)
bitmap_clear ([bb->index]);
   free (dfs);
diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 7089e8b5bf3..81bc51ed4ad 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -3657,7 +3657,10 @@ maybe_skip_until (gimple *phi, tree , basic_block 
target_bb,
   basic_block bb = gimple_bb (phi);
 
   if (!*visited)
-*visited = BITMAP_ALLOC (NULL);
+{
+  *visited = BITMAP_ALLOC (NULL);
+  bitmap_tree_view (*visited);
+}
 
   bitmap_set_bit (*visited, SSA_NAME_VERSION (PHI_RESULT (phi)));
 
@@ -3949,7 +3952,10 @@ walk_aliased_vdefs_1 (ao_ref *ref, tree vdef,
{
  unsigned i;
  if (!*visited)
-   *visited = BITMAP_ALLOC (NULL);
+   {
+ *visited = BITMAP_ALLOC (NULL);
+ bitmap_tree_view (*visited);
+   }
  for (i = 0; i < gimple_phi_num_args (def_stmt); ++i)
{
  int res = walk_aliased_vdefs_1 (ref,
-- 
2.35.3


[PATCH] tree-optimization/26854 - compile-time hog in SSA forwprop

2023-02-07 Thread Richard Biener via Gcc-patches
The following addresses

 tree forward propagate :  12.41 (  9%)

seen with the compile.i testcase of this PR which points at
the has_use_on_stmt function which, for SSA names with many
uses is slow.  The solution is to instead of immediate uses,
look at stmt operands to identify whether a name has a use
on a stmt.  That improves SSA forwprop to

 tree forward propagate :   1.30 (  0%)

for this testcase.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/26854
* gimple-fold.cc (has_use_on_stmt): Look at stmt operands
instead of immediate uses.
---
 gcc/gimple-fold.cc | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 379d2e930ea..935e8006413 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -5767,15 +5767,17 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool 
inplace)
 }
 
 
-/* Return true whether NAME has a use on STMT.  */
+/* Return true whether NAME has a use on STMT.  Note this can return
+   false even though there's a use on STMT if SSA operands are not
+   up-to-date.  */
 
 static bool
 has_use_on_stmt (tree name, gimple *stmt)
 {
-  imm_use_iterator iter;
-  use_operand_p use_p;
-  FOR_EACH_IMM_USE_FAST (use_p, iter, name)
-if (USE_STMT (use_p) == stmt)
+  ssa_op_iter iter;
+  tree op;
+  FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
+if (op == name)
   return true;
   return false;
 }
-- 
2.35.3


[PATCH] rs6000: Add new patterns rlwinm with mask

2023-02-07 Thread Jiufu Guo via Gcc-patches
Hi,

For code:
```
u64
test_rlwinm_lowpart_mask (u32 v)
{
  u32 v1 = ((v << N) | (v >> (32 - N))) & 0xf00;
  return (u64)v1;
}
```
We generate "rlwinm 3,3,4,4,23; rldicl 3,3,0,32" instead of "rlwinm 3,3,4,4,23".
Here the "rlwinm" cleans high32 bits already, so "rldicl" is reductant.

Similarly, for the below code which is the functionality of "rlwinm".
```
u64
test_rlwinm_mask (u32 v)
{
  u32 v1 = ((v << N) | (v >> (32 - N)));
  u64 v2 = (u64) v1 | ((u64) v1 << 32);
  return v2 & 0xe003ULL;
}
```
We generate
"rotlwi 3,3,4; sldi 9,3,32; add 3,9,3; rldicl 3,3,35,27; rldicl 3,3,29,0"
instead of "rlwinm 3,3,4,30,2".

This patch optimizes these two kinds of code to use just one "rlwinm" insn.

Bootstrap and regtests pass on ppc64{,le}.
Is this patch ok for trunk (or next stage1)?


BR,
Jeff (Jiufu)


gcc/ChangeLog:

* config/rs6000/predicates.md (lowpart_subreg_operand): New
define_predicate.
* config/rs6000/rs6000.md (rlwinm_lowpart_mask): New define_insn.
(rlwinm_mask_): New define_insn.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/rlwinm-0.c: Reduce instruction number.
* gcc.target/powerpc/rlwinm_3.c: New test.

---
 gcc/config/rs6000/predicates.md | 18 
 gcc/config/rs6000/rs6000.md | 34 +++
 gcc/testsuite/gcc.target/powerpc/rlwinm-0.c |  6 +--
 gcc/testsuite/gcc.target/powerpc/rlwinm_3.c | 47 +
 4 files changed, 102 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rlwinm_3.c

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 52c65534e51..3cc51d797c6 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1290,6 +1290,24 @@ (define_predicate "mma_disassemble_output_operand"
   return vsx_register_operand (op, mode);
 })
 
+;; Return 1 if this operand is lowpart subreg
+(define_predicate "lowpart_subreg_operand"
+  (match_code "subreg,reg")
+{
+  if (REG_P (op))
+return 1;
+
+  rtx inner_reg = SUBREG_REG (op);
+  if (!REG_P (inner_reg))
+return 0;
+
+  machine_mode inner_mode = GET_MODE (inner_reg);
+  if (SUBREG_BYTE (op) != subreg_lowpart_offset (mode, inner_mode))
+return 0;
+
+  return 1;
+})
+
 ;; Return true if operand is an operator used in rotate-and-mask instructions.
 (define_predicate "rotate_mask_operator"
   (match_code "rotate,ashift,lshiftrt"))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 4a7812fa592..a5d2b6ea000 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4325,6 +4325,40 @@ (define_insn "*rotldi3_insert_7"
   [(set_attr "type" "insert")
(set_attr "size" "64")])
 
+(define_insn "rlwinm_lowpart_mask"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+ (and:DI
+   (subreg:DI
+ (match_operator:SI 4 "rotate_mask_operator"
+   [(match_operand:SI 1 "lowpart_subreg_operand" "r")
+(match_operand:SI 2 "const_int_operand" "n")]) 0)
+   (match_operand:DI 3 "const_int_operand" "n")))]
+  "TARGET_POWERPC64 && (UINTVAL (operands[3]) >> 32) == 0
+   && rs6000_is_valid_shift_mask (operands[3], operands[4], SImode)"
+{
+  return rs6000_insn_for_shift_mask (SImode, operands, false);
+}
+  [(set_attr "type" "shift")])
+
+(define_insn "rlwinm_mask_"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+ (and:DI
+   (plus_ior_xor:DI
+(ashift:DI
+  (subreg:DI
+(match_operator:SI 4 "rotate_mask_operator"
+  [(match_operand:SI 1 "lowpart_subreg_operand" "r")
+   (match_operand:SI 2 "const_int_operand" "n")]) 0)
+  (const_int 32))
+(zero_extend:DI (match_dup 4)))
+   (match_operand:DI 3 "const_int_operand" "n")))]
+  "TARGET_POWERPC64
+   && (UINTVAL (operands[3]) & 0x8001ULL) == 0x8001ULL
+   && rs6000_is_valid_mask (operands[3], NULL, NULL, SImode)"
+{
+  return rs6000_insn_for_shift_mask (SImode, operands, false);
+}
+  [(set_attr "type" "shift")])
 
 ; This handles the important case of multiple-precision shifts.  There is
 ; no canonicalization rule for ASHIFT vs. LSHIFTRT, so two patterns.
diff --git a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c 
b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
index 4f4fca2d8ef..50ff01e1925 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
@@ -2,12 +2,12 @@
 /* { dg-options "-O2" } */
 
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 6739 { target ilp32 } } } 
*/
-/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9716 { target lp64 } } } 
*/
+/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 8164 { target lp64 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 3375 } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3081 { target lp64 } } } 
*/
+/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 1538 { target lp64 } } } 
*/
 
 /* { 

Re: [PATCH] Print padding size when aligning struct member

2023-02-07 Thread Andreas Schwab via Gcc-patches
On Feb 07 2023, Vít Kabele wrote:

> diff --git a/gcc/testsuite/gcc.dg/Wpadded.c b/gcc/testsuite/gcc.dg/Wpadded.c
> index 70fcd79a6d4..357e7f61e4a 100644
> --- a/gcc/testsuite/gcc.dg/Wpadded.c
> +++ b/gcc/testsuite/gcc.dg/Wpadded.c
> @@ -10,5 +10,5 @@
>  
>  struct foo {
>char bar;
> -  long baz; /* { dg-warning "padding struct to align" ""  { target { ! 
> default_packed } } } */
> +  long baz; /* { dg-warning "padding struct with 7 bytes to align" ""  { 
> target { ! default_packed } } } */

The actual amount of padding is target dependent.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH] Print padding size when aligning struct member

2023-02-07 Thread Vít Kabele
Announce the size of introduced padding when compiling
with -Wpadded.

gcc/ChangeLog:

* stor-layout.cc (place_field): Change warning message format

gcc/testsuite/ChangeLog:

* c-c++-common/Wpadded.c: Add new testcase
* gcc.dg/Wpadded.c: Update the expected warning message

Signed-off-by: Vít Kabele 
---
 gcc/stor-layout.cc   | 2 +-
 gcc/testsuite/c-c++-common/Wpadded.c | 1 +
 gcc/testsuite/gcc.dg/Wpadded.c   | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/stor-layout.cc b/gcc/stor-layout.cc
index 45bf2d18639..fb1948b94af 100644
--- a/gcc/stor-layout.cc
+++ b/gcc/stor-layout.cc
@@ -1346,7 +1346,7 @@ place_field (record_layout_info rli, tree field)
   if (!targetm.ms_bitfield_layout_p (rli->t)
  && DECL_SOURCE_LOCATION (field) != BUILTINS_LOCATION
  && !TYPE_ARTIFICIAL (rli->t))
-   warning (OPT_Wpadded, "padding struct to align %q+D", field);
+   warning (OPT_Wpadded, "padding struct with %d bytes to align 
%q+D", (desired_align - known_align)/8, field);
 
   /* If the alignment is still within offset_align, just align
 the bit position.  */
diff --git a/gcc/testsuite/c-c++-common/Wpadded.c 
b/gcc/testsuite/c-c++-common/Wpadded.c
index c5be4686822..35ff013026f 100644
--- a/gcc/testsuite/c-c++-common/Wpadded.c
+++ b/gcc/testsuite/c-c++-common/Wpadded.c
@@ -11,4 +11,5 @@
  * 4 byte fields to 2 byte boundary.
  */
 struct S { __UINT32_TYPE__ i; char c; } __attribute__((aligned(4))); /* { 
dg-warning "padding struct size to alignment boundary with 3 bytes" } */
+struct R { char c; __UINT32_TYPE__ i; } __attribute__((aligned(4))); /* { 
dg-warning "padding struct with 3 bytes to align 'i'" } */
 
diff --git a/gcc/testsuite/gcc.dg/Wpadded.c b/gcc/testsuite/gcc.dg/Wpadded.c
index 70fcd79a6d4..357e7f61e4a 100644
--- a/gcc/testsuite/gcc.dg/Wpadded.c
+++ b/gcc/testsuite/gcc.dg/Wpadded.c
@@ -10,5 +10,5 @@
 
 struct foo {
   char bar;
-  long baz; /* { dg-warning "padding struct to align" ""  { target { ! 
default_packed } } } */
+  long baz; /* { dg-warning "padding struct with 7 bytes to align" ""  { 
target { ! default_packed } } } */
 } futz;
-- 
2.30.2



[PATCH] lra: Replace subregs in bare uses & clobbers [PR108681]

2023-02-07 Thread Richard Sandiford via Gcc-patches
In this PR we had a write to one vector of a 4-vector tuple.
The vector had mode V1DI, and the target doesn't provide V1DI
moves, so this was converted into:

(clobber (subreg:V1DI (reg/v:V4x1DI 92 [ b ]) 24))

followed by a DImode move.  (The clobber isn't really necessary
or helpful for a single word, but would be for wider moves.)

The subreg in the clobber survived until after RA:

(clobber (subreg:V1DI (reg/v:V4x1DI 34 v2 [orig:92 b ] [92]) 24))

IMO this isn't well-formed.  If a subreg of a hard register simplifies
to a hard register, it should be replaced by the hard register.  If the
subreg doesn't simplify, then target-independent code can't be sure
which parts of the register are affected and which aren't.  A clobber
of such a subreg isn't useful and (again IMO) should just be removed.
Conversely, a use of such a subreg is effectively a use of the whole
inner register.

LRA has code to simplify subregs of hard registers, but it didn't
handle bare uses and clobbers.  The patch extends it to do that.

One question was whether the final_p argument to alter_subregs
should be true or false.  True is IMO dangerous, since it forces
replacements that might not be valid from a dataflow perspective,
and uses and clobbers only exist for dataflow.  As said above,
I think the correct way of handling a failed simplification would
be to delete clobbers and replace uses of subregs with uses of
the inner register.  But I didn't want to write untested code
to do that.

In the PR, the clobber caused an infinite loop in DCE, because
of a disagreement about what effect the clobber had.  But for
the reasons above, I think that was GIGO rather than a bug in
DF or DCE.

Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?

Richard


gcc/
PR rtl-optimization/108681
* lra-spills.cc (lra_final_code_change): Extend subreg replacement
code to handle bare uses and clobbers.

gcc/testsuite/
PR rtl-optimization/108681
* gcc.target/aarch64/pr108681.c: New test.
---
 gcc/lra-spills.cc   |  3 +++
 gcc/testsuite/gcc.target/aarch64/pr108681.c | 15 +++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr108681.c

diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index a8d7e60acd3..4af85c49d43 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -860,6 +860,9 @@ lra_final_code_change (void)
lra_update_dup (id, i);
insn_change_p = true;
  }
+ if ((GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
+ && alter_subregs ( (pat, 0), false))
+   insn_change_p = true;
  if (insn_change_p)
lra_update_operator_dups (id);
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr108681.c 
b/gcc/testsuite/gcc.target/aarch64/pr108681.c
new file mode 100644
index 000..2391eaac2f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr108681.c
@@ -0,0 +1,15 @@
+/* { dg-options "-O" } */
+
+#pragma GCC aarch64 "arm_neon.h"
+typedef __Int64x1_t int64x1_t;
+void foo (int64x1x4_t);
+
+void
+bar (int64x1_t a)
+{
+  for (;;) {
+int64x1x4_t b;
+b.val[3] = a;
+foo (b);
+  }
+}
-- 
2.25.1



Re: [PING] Re: [PATCH 2/2] Corrected pr25521.c target matching.

2023-02-07 Thread Cupertino Miranda via Gcc-patches


Hi Jeff,

Can you please confirm if the patch is Ok?

Thanks,
Cupertino

> Cupertino Miranda via Gcc-patches writes:
>
>> Thank you for the comments and suggestions.
>> I have changed the patch.
>>
>> Unfortunately in case of rx target I could not make
>> scan-assembler-symbol-section to match. I believe it is because the
>> .section and .global entries order is reversed in this target.
>>
>> Patch in inlined below. looking forward to your comments.
>>
>> Cupertino
>>
>> diff --git a/gcc/testsuite/gcc.dg/pr25521.c b/gcc/testsuite/gcc.dg/pr25521.c
>> index 63363a03b9f..82b4cd88ec0 100644
>> --- a/gcc/testsuite/gcc.dg/pr25521.c
>> +++ b/gcc/testsuite/gcc.dg/pr25521.c
>> @@ -2,9 +2,10 @@
>> sections.
>>
>> { dg-require-effective-target elf }
>> -   { dg-do compile } */
>> +   { dg-do compile }
>> +   { dg-skip-if "" { ! const_volatile_readonly_section } } */
>>
>>  const volatile int foo = 30;
>>
>> -
>> -/* { dg-final { scan-assembler "\\.s\?rodata" } } */
>> +/* { dg-final { scan-assembler {.section C,} { target { rx-*-* } } } } */
>> +/* { dg-final { scan-assembler-symbol-section {^_?foo$} 
>> {^\.(const|s?rodata)} { target { ! "rx-*-*" } } } } */
>> diff --git a/gcc/testsuite/lib/target-supports.exp 
>> b/gcc/testsuite/lib/target-supports.exp
>> index c0694af2338..91aafd89909 100644
>> --- a/gcc/testsuite/lib/target-supports.exp
>> +++ b/gcc/testsuite/lib/target-supports.exp
>> @@ -12295,3 +12295,13 @@ proc check_is_prog_name_available { prog } {
>>
>>  return 1
>>  }
>> +
>> +# returns 1 if target does selects a readonly section for const volatile 
>> variables.
>> +proc check_effective_target_const_volatile_readonly_section { } {
>> +
>> +if { [istarget powerpc-*-*]
>> +  || [check-flags { "" { powerpc64-*-* } { -m32 } }] } {
>> +return 0
>> +}
>> +  return 1
>> +}
>>
>>
>> Jeff Law writes:
>>
>>> On 12/7/22 08:45, Cupertino Miranda wrote:

> On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
>> This commit is a follow up of bugzilla #107181.
>> The commit /a0aafbc/ changed the default implementation of the
>> SELECT_SECTION hook in order to match clang/llvm behaviour w.r.t the
>> placement of `const volatile' objects.
>> However, the following targets use target-specific selection functions
>> and they choke on the testcase pr25521.c:
>>*rx - target sets its const variables as '.section C,"a",@progbits'.
> That's presumably a constant section.  We should instead twiddle the test 
> to
> recognize that section.
 Although @progbits is indeed a constant section, I believe it is
 more interesting to detect if the `rx' starts selecting more
 standard sections instead of the current @progbits.
 That was the reason why I opted to XFAIL instead of PASSing it.
 Can I keep it as such ?
>>> I'm not aware of any ongoing development for that port, so I would not let
>>> concerns about the rx port changing behavior dominate how we approach this
>>> problem.
>>>
>>> The rx port is using a different name for the section.  That's  valid thing 
>>> to
>>> do and to the extent we can, we should support that in the test rather than
>>> (incorrectly IMHO) xfailing the test just becuase the name isn't what we
>>> expected.
>>>
>>> To avoid over-eagerly matching, I would probably search for "C,"  I 
>>> wouldn't do
>>> that for the const or rodata sections as they often have a suffix like 1, 
>>> 2, 4,
>>> 8 for different sized rodata sections.
>>>
>>> PPC32 is explicitly doing something different and placing those objects 
>>> into an
>>> RW section.  So for PPC32 it makes more sense to skip the test rather than 
>>> xfail
>>> it.
>>>
>>> Jeff


Re: [PATCH] cgraph: Handle simd clones in cgraph_node::set_{const, pure}_flag [PR106433]

2023-02-07 Thread Richard Biener via Gcc-patches



> Am 07.02.2023 um 09:37 schrieb Jakub Jelinek :
> 
> Hi!
> 
> The following testcase ICEs, because we determine only in late pure const
> pass that bar is const (the content of the function loses a store to a
> global var during dse3 and read from it during cddce2) and local-pure-const2
> makes it const.  The cgraph ordering is that post IPA (in late IPA simd
> clones are created) bar is processed first, then foo as its caller, then
> foo.simdclone* and finally bar.simdclone*.  Conceptually I think that is the
> right ordering which allows for static simd clones to be removed.
> 
> The reason for the ICE is that because bar was marked const, the call to
> it lost vops before vectorization, and when we in foo.simdclone* try to
> vectorize the call to bar, we replace it with bar.simdclone* which hasn't
> been marked const and so needs vops, which we don't add.
> 
> Now, because the simd clones are created from the same IL, just in a loop
> with different argument/return value passing, I think generally if the base
> function is determined to be const or pure, the simd clones should be too,
> unless e.g. the vectorization causes different optimization decisions, but
> then still the global memory reads if any shouldn't affect what the function
> does and global memory stores shouldn't be reachable at runtime.
> 
> So, the following patch changes set_{const,pure}_flag to mark also simd
> clones.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok,
Thanks,
Richard 


> 2023-02-07  Jakub Jelinek  
> 
>PR tree-optimization/106433
>* cgraph.cc (set_const_flag_1): Recurse on simd clones too.
>(cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too.
> 
>* gcc.c-torture/compile/pr106433.c: New test.
> 
> --- gcc/cgraph.cc.jj2023-02-02 10:54:44.327473492 +0100
> +++ gcc/cgraph.cc2023-02-06 12:28:22.040593063 +0100
> @@ -2764,6 +2764,9 @@ set_const_flag_1 (cgraph_node *node, boo
>   if (!set_const || alias->get_availability () > AVAIL_INTERPOSABLE)
>set_const_flag_1 (alias, set_const, looping, changed);
> }
> +  for (struct cgraph_node *n = node->simd_clones; n != NULL;
> +   n = n->simdclone->next_clone)
> +set_const_flag_1 (n, set_const, looping, changed);
>   for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> if (e->caller->thunk
>&& (!set_const || e->caller->get_availability () > AVAIL_INTERPOSABLE))
> @@ -2876,6 +2879,9 @@ cgraph_node::set_pure_flag (bool pure, b
> {
>   struct set_pure_flag_info info = {pure, looping, false};
>   call_for_symbol_thunks_and_aliases (set_pure_flag_1, , !pure, true);
> +  for (struct cgraph_node *n = simd_clones; n != NULL;
> +   n = n->simdclone->next_clone)
> +set_pure_flag_1 (n, );
>   return info.changed;
> }
> 
> --- gcc/testsuite/gcc.c-torture/compile/pr106433.c.jj2023-02-06 
> 12:37:26.963748811 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr106433.c2023-02-06 
> 12:37:06.631041918 +0100
> @@ -0,0 +1,24 @@
> +/* PR tree-optimization/106433 */
> +
> +int m, *p;
> +
> +__attribute__ ((simd)) int
> +bar (int x)
> +{
> +  if (x)
> +{
> +  if (m < 1)
> +for (m = 0; m < 1; ++m)
> +  ++x;
> +  p = 
> +  for (;;)
> +++m;
> +}
> +  return 0;
> +}
> +
> +__attribute__ ((simd)) int
> +foo (int x)
> +{
> +  return bar (x);
> +}
> 
>Jakub
> 


Re: [PATCH] ipa-split: Don't split returns_twice functions [PR106923]

2023-02-07 Thread Richard Biener via Gcc-patches



> Am 07.02.2023 um 09:42 schrieb Jakub Jelinek via Gcc-patches 
> :
> 
> Hi!
> 
> As discussed in the PR, returns_twice functions are rare/special beasts
> that need special treatment in the cfg, and inside of their bodies
> we don't know which part actually works the weird returns twice way
> (either in the fork/vfork sense, or in the setjmp) and aren't updating
> ab edges to reflect that.
> 
> I think easiest is just to never split these, like we already never
> split noreturn or malloc functions.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard 

> 2023-02-07  Jakub Jelinek  
> 
>PR tree-optimization/106923
>* ipa-split.cc (execute_split_functions): Don't split returns_twice
>functions.
> 
>* gcc.dg/pr106923.c: New test.
> 
> --- gcc/ipa-split.cc.jj2023-01-02 09:32:22.492283737 +0100
> +++ gcc/ipa-split.cc2023-02-06 14:15:08.343271251 +0100
> @@ -1715,10 +1715,11 @@ execute_split_functions (void)
>   struct cgraph_node *node = cgraph_node::get (current_function_decl);
> 
>   if (flags_from_decl_or_type (current_function_decl)
> -  & (ECF_NORETURN|ECF_MALLOC))
> +  & (ECF_NORETURN|ECF_MALLOC|ECF_RETURNS_TWICE))
> {
>   if (dump_file)
> -fprintf (dump_file, "Not splitting: noreturn/malloc function.\n");
> +fprintf (dump_file, "Not splitting: noreturn/malloc/returns_twice "
> +"function.\n");
>   return 0;
> }
>   if (MAIN_NAME_P (DECL_NAME (current_function_decl)))
> --- gcc/testsuite/gcc.dg/pr106923.c.jj2023-02-06 14:19:33.464449400 +0100
> +++ gcc/testsuite/gcc.dg/pr106923.c2023-02-06 14:18:25.498429162 +0100
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/106923 */
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -finline-small-functions -fpartial-inlining --param 
> max-inline-insns-single=1 --param uninlined-function-insns=1" } */
> +
> +int n;
> +
> +int
> +baz (void);
> +
> +__attribute__ ((returns_twice)) int
> +bar (void)
> +{
> +  if (baz ())
> +++n;
> +
> +  return 0;
> +}
> +
> +int
> +foo (void)
> +{
> +  return bar ();
> +}
> 
>Jakub
> 


[committed] testsuite: Expect -Wdeprecated warning in warn/Wstrict-aliasing-bogus-union-2.C for C++23

2023-02-07 Thread Jakub Jelinek via Gcc-patches
Hi!

On Mon, Feb 06, 2023 at 02:26:01PM +, Jonathan Wakely via Gcc-patches wrote:
> With the recent change to deprecate std::aligned_storage and
> std::aligned_union we need to adjust some tests that now fail with
> -std=c++23.

The g++.dg/warn/Wstrict-aliasing-bogus-union-2.C test is also affected:
PASS: g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  -std=gnu++2b  (test for 
bogus messages, line 12)
FAIL: g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  -std=gnu++2b (test for 
excess errors)
Excess errors:
.../gcc/testsuite/g++.dg/warn/Wstrict-aliasing-bogus-union-2.C:8:8: warning: 
'template struct 
std::aligned_storage' is deprecated [-Wdeprecated-declarations]

The following patch adds dg-warning for it.

Tested on x86_64-linux -m32/-m64 with GXX_TESTSUITE_STDS=98,11,14,17,20,2b ,
committed to trunk as obvious.

2023-02-07  Jakub Jelinek  

* g++.dg/warn/Wstrict-aliasing-bogus-union-2.C: Expect
-Wdeprecated warning for C++23.

--- gcc/testsuite/g++.dg/warn/Wstrict-aliasing-bogus-union-2.C.jj   
2020-01-12 11:54:37.286400238 +0100
+++ gcc/testsuite/g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  2023-02-07 
10:21:47.616642483 +0100
@@ -5,7 +5,7 @@
 
 struct foo
 {
-  std::aligned_storage::type raw;
+  std::aligned_storage::type raw; /* { dg-warning 
"deprecated" "" { target c++23 } } */
 
   long& cooked()
 {

Jakub



[PATCH] ipa-split: Don't split returns_twice functions [PR106923]

2023-02-07 Thread Jakub Jelinek via Gcc-patches
Hi!

As discussed in the PR, returns_twice functions are rare/special beasts
that need special treatment in the cfg, and inside of their bodies
we don't know which part actually works the weird returns twice way
(either in the fork/vfork sense, or in the setjmp) and aren't updating
ab edges to reflect that.

I think easiest is just to never split these, like we already never
split noreturn or malloc functions.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-07  Jakub Jelinek  

PR tree-optimization/106923
* ipa-split.cc (execute_split_functions): Don't split returns_twice
functions.

* gcc.dg/pr106923.c: New test.

--- gcc/ipa-split.cc.jj 2023-01-02 09:32:22.492283737 +0100
+++ gcc/ipa-split.cc2023-02-06 14:15:08.343271251 +0100
@@ -1715,10 +1715,11 @@ execute_split_functions (void)
   struct cgraph_node *node = cgraph_node::get (current_function_decl);
 
   if (flags_from_decl_or_type (current_function_decl)
-  & (ECF_NORETURN|ECF_MALLOC))
+  & (ECF_NORETURN|ECF_MALLOC|ECF_RETURNS_TWICE))
 {
   if (dump_file)
-   fprintf (dump_file, "Not splitting: noreturn/malloc function.\n");
+   fprintf (dump_file, "Not splitting: noreturn/malloc/returns_twice "
+   "function.\n");
   return 0;
 }
   if (MAIN_NAME_P (DECL_NAME (current_function_decl)))
--- gcc/testsuite/gcc.dg/pr106923.c.jj  2023-02-06 14:19:33.464449400 +0100
+++ gcc/testsuite/gcc.dg/pr106923.c 2023-02-06 14:18:25.498429162 +0100
@@ -0,0 +1,23 @@
+/* PR tree-optimization/106923 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -finline-small-functions -fpartial-inlining --param 
max-inline-insns-single=1 --param uninlined-function-insns=1" } */
+
+int n;
+
+int
+baz (void);
+
+__attribute__ ((returns_twice)) int
+bar (void)
+{
+  if (baz ())
+++n;
+
+  return 0;
+}
+
+int
+foo (void)
+{
+  return bar ();
+}

Jakub



[PATCH] cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433]

2023-02-07 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because we determine only in late pure const
pass that bar is const (the content of the function loses a store to a
global var during dse3 and read from it during cddce2) and local-pure-const2
makes it const.  The cgraph ordering is that post IPA (in late IPA simd
clones are created) bar is processed first, then foo as its caller, then
foo.simdclone* and finally bar.simdclone*.  Conceptually I think that is the
right ordering which allows for static simd clones to be removed.

The reason for the ICE is that because bar was marked const, the call to
it lost vops before vectorization, and when we in foo.simdclone* try to
vectorize the call to bar, we replace it with bar.simdclone* which hasn't
been marked const and so needs vops, which we don't add.

Now, because the simd clones are created from the same IL, just in a loop
with different argument/return value passing, I think generally if the base
function is determined to be const or pure, the simd clones should be too,
unless e.g. the vectorization causes different optimization decisions, but
then still the global memory reads if any shouldn't affect what the function
does and global memory stores shouldn't be reachable at runtime.

So, the following patch changes set_{const,pure}_flag to mark also simd
clones.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-07  Jakub Jelinek  

PR tree-optimization/106433
* cgraph.cc (set_const_flag_1): Recurse on simd clones too.
(cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too.

* gcc.c-torture/compile/pr106433.c: New test.

--- gcc/cgraph.cc.jj2023-02-02 10:54:44.327473492 +0100
+++ gcc/cgraph.cc   2023-02-06 12:28:22.040593063 +0100
@@ -2764,6 +2764,9 @@ set_const_flag_1 (cgraph_node *node, boo
   if (!set_const || alias->get_availability () > AVAIL_INTERPOSABLE)
set_const_flag_1 (alias, set_const, looping, changed);
 }
+  for (struct cgraph_node *n = node->simd_clones; n != NULL;
+   n = n->simdclone->next_clone)
+set_const_flag_1 (n, set_const, looping, changed);
   for (cgraph_edge *e = node->callers; e; e = e->next_caller)
 if (e->caller->thunk
&& (!set_const || e->caller->get_availability () > AVAIL_INTERPOSABLE))
@@ -2876,6 +2879,9 @@ cgraph_node::set_pure_flag (bool pure, b
 {
   struct set_pure_flag_info info = {pure, looping, false};
   call_for_symbol_thunks_and_aliases (set_pure_flag_1, , !pure, true);
+  for (struct cgraph_node *n = simd_clones; n != NULL;
+   n = n->simdclone->next_clone)
+set_pure_flag_1 (n, );
   return info.changed;
 }
 
--- gcc/testsuite/gcc.c-torture/compile/pr106433.c.jj   2023-02-06 
12:37:26.963748811 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr106433.c  2023-02-06 
12:37:06.631041918 +0100
@@ -0,0 +1,24 @@
+/* PR tree-optimization/106433 */
+
+int m, *p;
+
+__attribute__ ((simd)) int
+bar (int x)
+{
+  if (x)
+{
+  if (m < 1)
+for (m = 0; m < 1; ++m)
+  ++x;
+  p = 
+  for (;;)
+++m;
+}
+  return 0;
+}
+
+__attribute__ ((simd)) int
+foo (int x)
+{
+  return bar (x);
+}

Jakub



Re: [DOC PATCH] Document the VEC_PERM_EXPR tree code (and minor clean-ups).

2023-02-07 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 6 Feb 2023 at 20:14, Roger Sayle  wrote:
>
>
> Perhaps I'm missing something (I'm not too familiar with SVE semantics), but
> is there
> a reason that the solution for PR96473 uses a VEC_PERM_EXPR and not just a
> VEC_DUPLICATE_EXPR?  The folding of sv1d1rq (svptrue_..., ...) doesn't seem
> to
> require either the blending or the permutation functionality of a
> VEC_PERM_EXPR.
> Instead, it seems to be misusing (the modified) VEC_PERM_EXPR as a form of
> VIEW_CONVERT_EXPR that allows us to convert/mismatch the type of the
> operands
> to the type of the result.
Hi,
I am not sure if we could use VEC_DUPLICATE_EXPR for PR96463 case as-is.
Perhaps we could extend VEC_DUPLICATE_EXPR to take N operands,
so the resulting vector has npatterns = N, nelts_per_pattern = 1 ?
AFAIU, extending VEC_PERM_EXPR to handle vectors with different lengths,
would allow for more optimization opportunities besides PR96463.
>
> Conceptually, (as in Richard's original motivation for the PR),
> svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), [0]); }
> can be optimized to (something like)
> svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); }  // or dup z0.q,
> z0.q[0] equivalent
I guess that should be equivalent to svdupq_s32 (x[0], x[1], x[2], x[3]) ?

Thanks,
Prathamesh



> hence it makes sense for fold to transform the gimple form of the first,
> into the
> gimple form of the second(?)
>
> Just curious.
> Roger
> --
>
> > -Original Message-
> > From: Richard Sandiford 
> > Sent: 06 February 2023 12:22
> > To: Richard Biener 
> > Cc: Roger Sayle ; GCC Patches  > patc...@gcc.gnu.org>
> > Subject: Re: [DOC PATCH] Document the VEC_PERM_EXPR tree code (and minor
> > clean-ups).
> >
> > Richard Biener  writes:
> > > On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle 
> > wrote:
> > >>
> > >>
> > >> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> > >> generic.texi.  For ease of review, it is provided below as a pair of
> > >> diffs.  The first contains just the new text added to describe
> > >> VEC_PERM_EXPR, the second tidies up this part of the documentation by
> > >> sorting the tree codes into alphabetical order, and providing
> > >> consistent section naming/capitalization, so changing this section
> > >> from "Vectors" to "Vector Expressions" (matching the nearby "Unary
> > >> and Binary Expressions").
> > >>
> > >> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> > >> The reviewer(s) can decide whether to approve just the new content,
> > >> or the content+clean-up.  Ok for mainline?
> > >
> > > +@item VEC_PERM_EXPR
> > > +This node represents a vector permute/blend operation.  The three
> > > +operands must be vectors of the same number of elements.  The first
> > > +and second operands must be vectors of the same type as the entire
> > > +expression,
> > >
> > > this was recently relaxed for the case of constant permutes in which
> > > case the first and second operands only have to have the same element
> > > type as the result.  See tree-cfg.cc:verify_gimple_assign_ternary.
> > >
> > > The following description will become a bit more awkward here and for
> > > rhs1/rhs2 with different number of elements the modulo interpretation
> > > doesn't hold - I believe we require in-bounds elements for constant
> > > permutes.  Richard can probably clarify things here.
> >
> > I thought that the modulo behaviour still applies when the node has a
> constant
> > selector, it's just that the in-range form is the canonical one.
> >
> > With variable-length vectors, I think it's in principle possible to have a
> stepped
> > constant selector whose start elements are in-range but whose final
> elements
> > aren't (and instead wrap around when applied).
> > E.g. the selector could zip the last quarter of the inputs followed by the
> first
> > quarter.
> >
> > Thanks,
> > Richard
>