date:20230519

[Bug rtl-optimization/105753] [avr] ICE: in add_clobbers, at config/avr/avr-dimode.md:2705

2023-05-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105753

--- Comment #19 from CVS Commits  ---
The master branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:80348e6aec44966e20ca1ca823247ce1381071eb

commit r14-1016-g80348e6aec44966e20ca1ca823247ce1381071eb
Author: Triffid Hunter 
Date:   Sat May 20 07:50:00 2023 +0200

target/105753: Fix ICE in add_clobbers due to extra PARALLEL in insn.

This patch removes the superfluous parallel in [u]divmod patterns in
the AVR backend.  Effect of extra parallel is that add_clobbers reaches
gcc_unreachable() because the clobbers for [u]divmod are missing.
If an insn has multiple parts like clobbers, the parallel around the
parts of the insn pattern is implicit.

gcc/
PR target/105753
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod4): Tidy code.  Use gcc_unreachable() instead of
printing error text to assembly.

gcc/testsuite/
PR target/105753
* gcc.target/avr/torture/pr105753.c: New test.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #10 from Andrew Pinski  ---
I applied the patches now, after approval; r14-1014-gc5df248509b48 is the one
that makes the difference here.
I am not working on improving the ^1 part, though, so I am leaving this open for that.

[Bug target/55181] [10/11/12/13/14 Regression] Expensive shift loop where a bit-testing instruction could be used

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55181

--- Comment #28 from Andrew Pinski  ---
I forgot to mention this was fixed by r14-1014-gc5df248509b48 .

[Bug target/55181] [10/11/12/13/14 Regression] Expensive shift loop where a bit-testing instruction could be used

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55181

Andrew Pinski  changed:

   What|Removed |Added

 Target||avr
   Target Milestone|10.5|14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #27 from Andrew Pinski  ---
This is now fixed on the trunk for GCC 14. I have no plans on backporting the
patches.

Re: [PATCH] [RISC-V] Fix riscv_expand_conditional_move.

2023-05-19 Thread Jeff Law via Gcc-patches





On 4/27/23 20:21, Die Li wrote:

Two issues have been observed in current riscv_expand_conditional_move
implementation.
1. Before the introduction of TARGET_XTHEADCONDMOV, op0 of the comparison expression
was used for the mode comparison with word_mode. After TARGET_XTHEADCONDMOV was
merged with TARGET_SFB_ALU, the dest of the if-then-else is used for the mode
comparison with word_mode, and according to the md file the mode of dest is DI or SI,
which can differ from word_mode on RV64.

2. TARGET_XTHEADCONDMOV cannot be generated when the mode of the comparison is
E_VOID.

This patch solves the issues above.

Provide an example from the newly added test case.

Testcase:
int ConNmv_reg_reg_reg(int x, int y, int z, int n){
   if (x != y) return z;
   return n;
}

Cflags:
-O2 -march=rv64gc_xtheadcondmov -mabi=lp64d

before patch:
ConNmv_reg_reg_reg:
bne a0,a1,.L23
mv  a2,a3
.L23:
mv  a0,a2
ret

after patch:
ConNmv_reg_reg_reg:
sub a1,a0,a1
th.mveqz a2,zero,a1
th.mvnez a3,zero,a1
or  a0,a2,a3
ret
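
The "after patch" sequence can be checked against the source with a small C model. This is only a sketch: th_mveqz and th_mvnez are hypothetical helpers, written under the assumption that "th.mveqz rd, rs1, rs2" copies rs1 into rd when rs2 is zero (and th.mvnez when rs2 is nonzero) and otherwise leaves rd unchanged. after_patch_model is likewise an illustrative name, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics: rd = (rs2 == 0) ? rs1 : rd.  */
static int64_t th_mveqz (int64_t rd, int64_t rs1, int64_t rs2)
{
  return rs2 == 0 ? rs1 : rd;
}

/* Assumed semantics: rd = (rs2 != 0) ? rs1 : rd.  */
static int64_t th_mvnez (int64_t rd, int64_t rs1, int64_t rs2)
{
  return rs2 != 0 ? rs1 : rd;
}

/* The source-level test case from the patch description.  */
static int ConNmv_reg_reg_reg (int x, int y, int z, int n)
{
  if (x != y) return z;
  return n;
}

/* Register-level model of the "after patch" instruction sequence.  */
static int after_patch_model (int x, int y, int z, int n)
{
  int64_t a0 = x, a1 = y, a2 = z, a3 = n;
  a1 = a0 - a1;                 /* sub      a1,a0,a1   */
  a2 = th_mveqz (a2, 0, a1);    /* th.mveqz a2,zero,a1 */
  a3 = th_mvnez (a3, 0, a1);    /* th.mvnez a3,zero,a1 */
  a0 = a2 | a3;                 /* or       a0,a2,a3   */
  return (int) a0;
}
```

Exactly one of a2 and a3 is cleared, so the final OR selects the surviving value without a branch.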

Co-Authored by: Fei Gao 
Signed-off-by: Die Li 

gcc/ChangeLog:

 * config/riscv/riscv.cc (riscv_expand_conditional_move): Fix mode checking.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/xtheadcondmov-indirect-rv32.c: New test.
 * gcc.target/riscv/xtheadcondmov-indirect-rv64.c: New test.
---
  gcc/config/riscv/riscv.cc |   4 +-
  .../riscv/xtheadcondmov-indirect-rv32.c   | 116 ++
  .../riscv/xtheadcondmov-indirect-rv64.c   | 116 ++
  3 files changed, 234 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect-rv64.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1529855a2b4..30ace45dc5f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3411,7 +3411,7 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
&& GET_MODE_CLASS (mode) == MODE_INT
&& reg_or_0_operand (cons, mode)
&& reg_or_0_operand (alt, mode)
-  && GET_MODE (op) == mode
+  && (GET_MODE (op) == mode || GET_MODE (op) == E_VOIDmode)

So I nearly suggested we just drop this check.  In general comparisons 
don't have modes.  But I don't think it's going to hurt and it lines up 
with the predicates that test for conditions.


Note that some of the new tests are still failing (though they certainly 
do much better after your patch)

  FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O1   check-function-bodies ConNmv_imm_imm_reg
  FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   check-function-bodies ConNmv_imm_imm_reg
  FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none   check-function-bodies ConNmv_imm_imm_reg
  FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects   check-function-bodies ConNmv_imm_imm_reg
  FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O3 -g   check-function-bodies ConNmv_imm_imm_reg



[ ... and a few more instances omitted ... ]

I went ahead and pushed the patch, but you might want to double-check 
the state of those failing tests.


Jeff

Re: [PATCH 7/7] Expand directly for single bit test

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

Instead of creating trees for the expansion,
just expand directly, which makes the code a little simpler
and also reduces how much GC memory is used during the expansion.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Rename to ...
(expand_single_bit_test): This and expand directly.
(do_store_flag): Update for the renamed function.

OK.

jeff

[PATCH] RISC-V: Add RVV comparison autovectorization

2023-05-19 Thread juzhe . zhong

From: Juzhe-Zhong 

This patch enables RVV auto-vectorization of comparisons, including floating-point
unordered and ordered comparisons.

The testcases are leveraged from Richard.
So include Richard as co-author.
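
For orientation, these are the kinds of scalar loops that compare-and-select vectorization targets. This is a minimal sketch with hypothetical function names; it is not taken from the vcond-*.c tests, whose contents are not shown here.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Integer compare and select: each element of r is picked from x or y
   depending on a per-element comparison of a and b.  */
static void select_gt_int (int *restrict r,
                           const int *restrict a, const int *restrict b,
                           const int *restrict x, const int *restrict y,
                           size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = a[i] > b[i] ? x[i] : y[i];
}

/* Floating-point unordered comparison: true when either operand is a NaN.  */
static void select_unordered_float (float *restrict r,
                                    const float *restrict a,
                                    const float *restrict b,
                                    size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = isunordered (a[i], b[i]) ? 1.0f : 0.0f;
}
```

Whether a given loop is actually vectorized depends on the target options and the vectorizer's cost model.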

Co-Authored-By: Richard Sandiford 

gcc/ChangeLog:

* config/riscv/autovec.md (vcond): New pattern.
(vcondu): Ditto.
(vcond): Ditto.
(vec_cmp): Ditto.
(vec_cmpu): Ditto.
(vcond_mask_): Ditto.
* config/riscv/riscv-protos.h (expand_vec_cmp_int): New function.
(expand_vec_cmp_float): New function.
(expand_vcond): New function.
(emit_merge_op): Adapt function.
* config/riscv/riscv-v.cc (emit_pred_op): Ditto.
(emit_pred_binop): Ditto.
(emit_pred_unop): New function.
(emit_len_binop): Adapt function.
(emit_len_unop): New function.
(emit_index_op): Adapt function.
(emit_merge_op): Ditto.
(expand_vcond): New function.
(emit_pred_cmp): Ditto.
(emit_len_cmp): Ditto.
(expand_vec_cmp_int): Ditto.
(expand_vec_cmp_float): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp:
* gcc.target/riscv/rvv/autovec/cmp/vcond-1.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond-2.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond-3.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-3.c: New test.

---
 gcc/config/riscv/autovec.md   | 141 +
 gcc/config/riscv/riscv-protos.h   |   4 +
 gcc/config/riscv/riscv-v.cc   | 482 --
 .../riscv/rvv/autovec/cmp/vcond-1.c   | 157 ++
 .../riscv/rvv/autovec/cmp/vcond-2.c   |  75 +++
 .../riscv/rvv/autovec/cmp/vcond-3.c   |  13 +
 .../riscv/rvv/autovec/cmp/vcond_run-1.c   |  49 ++
 .../riscv/rvv/autovec/cmp/vcond_run-2.c   |  76 +++
 .../riscv/rvv/autovec/cmp/vcond_run-3.c   |   6 +
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp|   2 +
 10 files changed, 970 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond_run-3.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index ce0b46537ad..5d8ba66f0c3 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -180,3 +180,144 @@
NULL_RTX, mode);
   DONE;
 })
+
+;; =
+;; == Comparisons and selects
+;; =
+
+;; -
+;;  [INT,FP] Compare and select
+;; -
+;; The patterns in this section are synthetic.
+;; -
+
+;; Integer (signed) vcond.  Don't enforce an immediate range here, since it
+;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
+(define_expand "vcond"
+  [(set (match_operand:V 0 "register_operand")
+   (if_then_else:V
+ (match_operator 3 "comparison_operator"
+   [(match_operand:VI 4 "register_operand")
+(match_operand:VI 5 "nonmemory_operand")])
+ (match_operand:V 1 "nonmemory_operand")
+ (match_operand:V 2 "nonmemory_operand")))]
+  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
+   GET_MODE_NUNITS (mode))"
+  {
+riscv_vector::expand_vcond (mode, operands);
+DONE;
+  }
+)
+
+;; Integer vcondu.  Don't enforce an immediate range here, since it
+;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
+(define_expand "vcondu"
+  [(set (match_operand:V 0 "register_operand")
+   (if_then_else:V
+ (match_operator 3 "comparison_operator"
+   [(match_operand:VI 4 "register_operand")
+(match_operand:VI 5 "nonmemory_operand")])
+ (match_operand:V 1 "nonmemory_operand")
+ (match_operand:V 2 "nonmemory_operand")))]
+  "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode),
+   GET_MODE_NUNITS (mode))"
+  {
+riscv_vector::expand_vcond (mode, operands);
+DONE;
+  }
+)
+
+;; Floating-point vcond.  Don't enforce an immediate range here, since it
+;; depends on the comparison; leave it to riscv_vector::expand_vcond instead.
+(define_expand "vcond"
+  [(set

Re: [PATCH 6/7] Use BIT_FIELD_REF inside fold_single_bit_test

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

Instead of depending on combine to do the extraction,
let's create a tree which will expand directly into
the extraction. This improves code generation on some
targets.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Use BIT_FIELD_REF
instead of shift/and.

OK.
jeff

Re: [PATCH 5/7] Simplify fold_single_bit_test with respect to code

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

Since we know that fold_single_bit_test is now only passed
NE_EXPR or EQ_EXPR, we can simplify it and just use a gcc_assert
to assert that is the code that is being passed.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Add an assert
and simplify based on code being NE_EXPR or EQ_EXPR.

OK.
jeff

Re: [PATCH 4/7] Simplify fold_single_bit_test slightly

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

Now that the only use of fold_single_bit_test is in do_store_flag,
we can change it to take the inner arg and bitnum
instead of building a tree. There are no code generation changes
due to this change, only a decrease in the GC memory that is produced
during expansion.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Take inner and bitnum
instead of arg0 and arg1. Update the code.
(do_store_flag): Don't create a tree when calling
fold_single_bit_test instead just call it with the bitnum
and the inner tree.

OK.
jeff

Re: [PATCH 3/7] Use get_def_for_expr in fold_single_bit_test

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

The code in fold_single_bit_test checks whether
the inner is a right shift and adjusts the bitnum
based on that. But since the inner will always be an
SSA_NAME at this point, that code is dead. Switch it over
to use the helper function get_def_for_expr instead.
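
The bitnum adjustment rests on a simple identity, sketched here in C for an unsigned 32-bit value. The function names are hypothetical, and the sketch only models the arithmetic, not the tree or gimple handling.

```c
#include <assert.h>
#include <stdint.h>

/* Test bit BITNUM of (x >> shift), written literally.  */
static int bit_of_shifted (uint32_t x, unsigned shift, unsigned bitnum)
{
  return ((x >> shift) >> bitnum) & 1;
}

/* The same test after folding the shift into the bit number.  This is
   only valid while shift + bitnum stays below the precision (32 here),
   which mirrors the overflow guard in the original code.  */
static int bit_after_folding (uint32_t x, unsigned shift, unsigned bitnum)
{
  assert (shift + bitnum < 32);
  return (x >> (shift + bitnum)) & 1;
}
```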

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Use get_def_for_expr
instead of checking the inner's code.

OK.
jeff

Re: [PATCH 2/7] Inline and simplify fold_single_bit_test_into_sign_test into fold_single_bit_test

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

Since the last use of fold_single_bit_test_into_sign_test is fold_single_bit_test,
we can inline it and even simplify the inlined version. This has
no behavior change.
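
As background, the sign-test form that the inlined helper produces relies on the following equivalence. This is a hedged sketch with hypothetical names; it assumes a 32-bit two's complement type and GCC's wrapping unsigned-to-signed conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Test the top bit with a mask.  */
static int top_bit_set_mask (uint32_t a)
{
  return (a & UINT32_C (0x80000000)) != 0;
}

/* Test the top bit as a sign test: (A & signbit) != 0 becomes
   (signed) A < 0.  The conversion is implementation-defined in ISO C;
   GCC defines it as modulo 2^32.  */
static int top_bit_set_sign (uint32_t a)
{
  return (int32_t) a < 0;
}
```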

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test_into_sign_test): Inline into ...
(fold_single_bit_test): This and simplify.

Just to be clear, based on the NFC assumption, this is OK for the trunk.
jeff

Re: [PATCH 2/7] Inline and simplify fold_single_bit_test_into_sign_test into fold_single_bit_test

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

Since the last use of fold_single_bit_test_into_sign_test is fold_single_bit_test,
we can inline it and even simplify the inlined version. This has
no behavior change.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test_into_sign_test): Inline into ...
(fold_single_bit_test): This and simplify.

Going to trust the inlining and simplification is really NFC.  It's not 
really obvious from the patch.


jeff

Re: [PATCH 1/7] Move fold_single_bit_test to expr.cc from fold-const.cc

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote:

This is part 1 of N patch set that will change the expansion
of `(A & C) != 0` from using trees to directly expanding so later
on we can do some cost analysis.

Since the only user of fold_single_bit_test is now
expand, move it there.
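
For readers new to the series, the transformation in question rewrites a single-bit test as a shift and mask. A minimal C sketch of the two forms, with hypothetical function names, for a 32-bit unsigned operand:

```c
#include <assert.h>
#include <stdint.h>

/* (A & C) != 0 where C = 1 << bitnum is a single bit.  */
static int single_bit_ne (uint32_t a, unsigned bitnum)
{
  return (a & (UINT32_C (1) << bitnum)) != 0;
}

/* Equivalent shift form: (A >> C2) & 1, where C2 = log2 (C).  */
static int single_bit_ne_shift (uint32_t a, unsigned bitnum)
{
  return (a >> bitnum) & 1;
}

/* (A & C) == 0 for the same single-bit C.  */
static int single_bit_eq (uint32_t a, unsigned bitnum)
{
  return (a & (UINT32_C (1) << bitnum)) == 0;
}

/* Equivalent shift form with the result inverted by XOR with 1.  */
static int single_bit_eq_shift (uint32_t a, unsigned bitnum)
{
  return ((a >> bitnum) & 1) ^ 1;
}
```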

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* fold-const.cc (fold_single_bit_test_into_sign_test): Move to
expr.cc.
(fold_single_bit_test): Likewise.
* expr.cc (fold_single_bit_test_into_sign_test): Move from fold-const.cc
(fold_single_bit_test): Likewise and make static.
* fold-const.h (fold_single_bit_test): Remove declaration.

I'm assuming this is purely moving the bits around.

OK.

jeff

[Bug c/60090] For expression without ~, gcc -O1 emits "comparison of promoted ~unsigned with unsigned"

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60090

--- Comment #7 from Andrew Pinski  ---
This one still happens on the trunk even with PR 107465 fixed. The reason is
that even though a warning here is correct, it is not wanted because it
requires constant folding. Note you can also get the incorrect warning wording
at -O0 with constexpr in GCC 13+ (and -std=c2x).
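
For background on this family of warnings, the integer promotion involved can be shown with a short C sketch. This is not the test case from the bug; the function names are hypothetical, and the sketch assumes int is wider than unsigned char.

```c
#include <assert.h>

/* ~a is computed after promoting a to int, so the result is negative
   for every unsigned char value and can never equal the promoted b.  */
static int complement_equals_promoted (unsigned char a, unsigned char b)
{
  return ~a == b;
}

/* Truncating the complement back to unsigned char gives the comparison
   that was most likely intended.  */
static int complement_equals_truncated (unsigned char a, unsigned char b)
{
  return (unsigned char) ~a == b;
}
```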

[Bug c/107465] [10 Regression] Bogus warning: promoted bitwise complement of an unsigned value is always nonzero

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107465

Andrew Pinski  changed:

   What|Removed |Added

 CC||jmattsson at dius dot com.au

--- Comment #22 from Andrew Pinski  ---
*** Bug 59098 has been marked as a duplicate of this bug. ***

[Bug c/59098] Unwarranted warning: promoted ~unsigned is always non-zero [-Wsign-compare]

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59098

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #5 from Andrew Pinski  ---
Marking as a dup of bug 107465 as that is what fixed the issue here.

*** This bug has been marked as a duplicate of bug 107465 ***

[Bug c/107465] [10 Regression] Bogus warning: promoted bitwise complement of an unsigned value is always nonzero

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107465

Andrew Pinski  changed:

   What|Removed |Added

 CC||fredrik.hederstierna@securitas-direct.com

--- Comment #21 from Andrew Pinski  ---
*** Bug 38341 has been marked as a duplicate of this bug. ***

[Bug c/38341] Wrong warning comparison of promoted ~unsigned with unsigned

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38341

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #13 from Andrew Pinski  ---
So this has been fixed on all of the active branches. Since PR 107465 was the
one recorded in the changelog, closing as a dup of that one.

*** This bug has been marked as a duplicate of bug 107465 ***

Re: [PATCH] Mode-Switching: Fix local array maybe uninitialized warning

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/19/23 17:56, pan2...@intel.com wrote:

From: Pan Li 

There are 2 local arrays in the function optimize_mode_switching. They are
initialized conditionally at the beginning but then always consumed in
another loop. This may trigger the maybe-uninitialized warning, and may
result in a build failure when -Werror (warnings as errors) is enabled.

This patch initializes the local arrays to zero explicitly at
declaration.
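
The pattern being fixed can be sketched in C as follows. MAX_ENTITIES and count_active are hypothetical names for illustration; only the shape (conditional initialization, unconditional use, zero-initialization at declaration) reflects the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ENTITIES 4   /* hypothetical bound, for illustration only */

static int count_active (const bool enabled[MAX_ENTITIES])
{
  /* Zero-initialized at declaration, so every slot has a defined value
     even if the loop below never writes to it.  */
  int entity_map[MAX_ENTITIES] = { 0 };
  int n_entities = 0;

  /* Conditional initialization: only enabled entities get a slot.  */
  for (int e = 0; e < MAX_ENTITIES; e++)
    if (enabled[e])
      entity_map[n_entities++] = e + 1;

  /* Unconditional consumption of every slot.  Without the initializer
     this read is what a maybe-uninitialized warning would point at.  */
  int sum = 0;
  for (int j = 0; j < MAX_ENTITIES; j++)
    sum += entity_map[j];
  return sum;
}
```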

Signed-off-by: Pan Li 

gcc/ChangeLog:

* mode-switching.cc (entity_map): Initialize the array to zero.
(bb_info): Ditto.

OK.
jeff

[Bug c/52050] Want an option to warn about a declaration inside a for/while/if statements.

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52050

Andrew Pinski  changed:

   What|Removed |Added

 Blocks||87403

--- Comment #7 from Andrew Pinski  ---
This now warns with -Wc90-c99-compat (since GCC 9), though it does not
have its own option.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
[Bug 87403] [Meta-bug] Issues that suggest a new warning

[Bug c++/66555] Fails to warn for if (j == 0 && i == i)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66555

Andrew Pinski  changed:

   What|Removed |Added

 CC||trt at alumni dot duke.edu

--- Comment #4 from Andrew Pinski  ---
*** Bug 17534 has been marked as a duplicate of this bug. ***

[Bug c/17534] gcc fails to diagnose suspect expressions that have incompatible bit masks

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17534

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |6.0
 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #10 from Andrew Pinski  ---
Fixed for GCC 6 by r6-2453-g05b28fd6f91016 (aka PR 66555) so just marking as a
dup of that bug.

*** This bug has been marked as a duplicate of bug 66555 ***

[Bug middle-end/49617] gcc misses uninititialized variables in contained functions

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49617

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2011-07-04 09:48:32 |2023-5-19
  Component|c   |middle-end

--- Comment #2 from Andrew Pinski  ---
Hmm, for the C testcase with GCC 5, we do get a warning:

: In function 'main':
:11:6: warning: 'FRAME.0.y' is used uninitialized in this function
[-Wuninitialized]
x = y;
  ^

There is no warning in GCC 6+ though.

Plus the diagnostic mentions FRAME.0. which is not in the original source.

[Bug c/20110] format checking and non-ASCII character sets

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20110

Andrew Pinski  changed:

   What|Removed |Added

 CC||bonzini at gnu dot org

--- Comment #4 from Andrew Pinski  ---
*** Bug 33748 has been marked as a duplicate of this bug. ***

[Bug c/33748] format warnings don't take input charset into account

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33748

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Andrew Pinski  ---
Dup of bug 20110.

*** This bug has been marked as a duplicate of bug 20110 ***

Re: [PATCH v2] RISC-V: Add bext pattern for ZBS

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/8/23 08:11, Raphael Moreira Zinsly wrote:

Changes since v1:
 - Removed name clash change.
 - Fix new pattern indentation.

-- >8 --

When (a & (1 << bit_no)) is tested inside an IF we can use a bit extract.
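
A sketch of the source shape and of the bit extract it maps to. bext_model is a hypothetical helper that assumes the RV64 Zbs definition rd = (rs1 >> (rs2 & 63)) & 1; the functions are illustrative and are not the contents of zbs-bext-02.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of the Zbs "bext" instruction on RV64.  */
static uint64_t bext_model (uint64_t rs1, uint64_t rs2)
{
  return (rs1 >> (rs2 & 63)) & 1;
}

/* Source-level shape: a single-bit test controlling a branch.  */
static int branch_on_bit (uint64_t a, unsigned bit_no)
{
  if (a & (UINT64_C (1) << bit_no))
    return 10;
  return 20;
}

/* The same branch driven by the extracted bit.  */
static int branch_on_bext (uint64_t a, unsigned bit_no)
{
  if (bext_model (a, bit_no))
    return 10;
  return 20;
}
```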

gcc/ChangeLog:

* config/riscv/bitmanip.md
(branch_bext): New split pattern.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-bext-02.c: New test.

I went ahead and pushed this.

jeff

[Bug middle-end/55279] New pseudo registers aren't supported in CSE

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55279

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> I think combine was changed for a similar reason, to support pseudos, but I
> cannot find the patch right now.

Note combine was only fully fixed recently in GCC 12 with
r12-8030-g61bee6aed26eb3.

[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c

2023-05-19 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888

Jeffrey A. Law  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #12 from Jeffrey A. Law  ---
Should be fixed with Raphael's patch on the trunk.

Re: [PATCH v2] RISC-V: Fix CTZ unnecessary sign extension [PR #106888]

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/8/23 08:12, Raphael Moreira Zinsly wrote:

Changes since v1:
- Remove subreg from operand 1.

-- >8 --

We were not able to match the CTZ sign extend pattern on RISC-V
because it gets optimized to zero extend and/or to ANDI patterns.
For the ANDI case, combine scrambles the RTL and generates the
extension by using subregs.
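
One way to see why the extension is redundant: a count of trailing zeros of a nonzero 32-bit value lies in 0..31, so sign extension and zero extension give the same 64-bit result. A minimal sketch using the GCC builtin __builtin_ctz (undefined for a zero argument); the wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ctz result sign-extended to 64 bits.  */
static int64_t ctz_sign_extended (uint32_t x)
{
  return (int64_t) (int32_t) __builtin_ctz (x);
}

/* ctz result zero-extended to 64 bits.  */
static int64_t ctz_zero_extended (uint32_t x)
{
  return (int64_t) (uint32_t) __builtin_ctz (x);
}
```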

gcc/ChangeLog:
PR target/106888
* config/riscv/bitmanip.md
(disi2): Match with any_extend.
(disi2_sext): New pattern to match
with sign extend using an ANDI instruction.

gcc/testsuite/ChangeLog:
PR target/106888
* gcc.target/riscv/pr106888.c: New test.
* gcc.target/riscv/zbbw.c: Check for ANDI.

Thanks.  I went ahead and retested this against the trunk and pushed it.

jeff

[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c

2023-05-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:9000da00dd70988f30d43806bae33b22ee6b9904

commit r14-1006-g9000da00dd70988f30d43806bae33b22ee6b9904
Author: Raphael Moreira Zinsly 
Date:   Fri May 19 20:54:34 2023 -0600

RISC-V: Fix CTZ unnecessary sign extension [PR #106888]

Changes since v1:
- Remove subreg from operand 1.

-- >8 --

We were not able to match the CTZ sign extend pattern on RISC-V
because it gets optimized to zero extend and/or to ANDI patterns.
For the ANDI case, combine scrambles the RTL and generates the
extension by using subregs.

gcc/ChangeLog:
PR target/106888
* config/riscv/bitmanip.md
(disi2): Match with any_extend.
(disi2_sext): New pattern to match
with sign extend using an ANDI instruction.

gcc/testsuite/ChangeLog:
PR target/106888
* gcc.target/riscv/pr106888.c: New test.
* gcc.target/riscv/zbbw.c: Check for ANDI.

[Bug middle-end/31271] Missing simple optimization

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31271

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |4.7.0

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> 
> I think we could do slightly better
> ((~in_2(D)) & 224) == 0
> 
> But only at expand time.
> This gives:
> notl    %edi
> xorl    %eax, %eax
> testb   $-32, %dil
> setne   %al

x86_64 produces that in GCC 13 with r13-792-g29ae455901ac71 .

> 
> Or for aarch64:
> mov     w8, #224
> bics    wzr, w8, w0
> cset    w0, ne
> ret

For aarch64, it could define an instruction to catch:
(set (reg:CC_NZV 66 cc)
(compare:CC_NZV (and:SI (not:SI (reg:SI 100))
(const_int 224 [0xe0]))
(const_int 0 [0])))


Anyways the original issue was fixed in GCC 4.7.0 and the small improvement for
x86_64 is in GCC 13. The aarch64 code generation is currently:
and     w0, w0, 224
cmp     w0, 224
cset    w0, ne
ret

Which is only slightly worse than what I proposed too.
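
The identity behind the two forms is that every bit of the mask is set in the input exactly when the complemented input has none of them set. A small C sketch with hypothetical function names:

```c
#include <assert.h>

/* "and, compare with the mask" form: all three mask bits are set.  */
static int all_bits_set_cmp (unsigned in)
{
  return (in & 224) == 224;
}

/* "not, test" form: none of the mask bits is clear.  */
static int all_bits_set_not (unsigned in)
{
  return ((~in) & 224) == 0;
}
```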

[Bug middle-end/31631] Folding of A & (1 << B) pessimizes FRE

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31631

--- Comment #2 from Andrew Pinski  ---
We do a LIM before PRE now which allows PRE to handle it.

[Bug middle-end/100798] a?~t:t and (-(!!a))^t don't produce the same assembly code

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100798

--- Comment #1 from Andrew Pinski  ---
To produce the same code we could do a match pattern:
(simplify
 (cond @0 (bit_not @1) @1)
 (bit_xor (neg (convert @0)) @1))
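
The equivalence the pattern relies on can be checked at the source level. A minimal C sketch with hypothetical function names: -(!!a) is all ones when a is nonzero and zero otherwise, so the XOR either complements t or leaves it unchanged.

```c
#include <assert.h>

/* Conditional form: a ? ~t : t.  */
static int cond_not (int a, int t)
{
  return a ? ~t : t;
}

/* Mask form: (-(!!a)) ^ t.  */
static int xor_mask (int a, int t)
{
  return (-(!!a)) ^ t;
}
```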

[Bug middle-end/64334] Common .opt handling: Support flags which take a list of values (-fopt=a,b,c ...)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64334

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||internal-improvement
   Target Milestone|--- |12.0

--- Comment #2 from Andrew Pinski  ---
EnumBitSet was added with r12-6842-g0ebb09f5e49c8c .
EnumSet/Set was added with r12-6839-g385196adb52d85 .

So fixed with GCC 12.

Note fsanitize= is still not using those for other reasons.

[PATCH 5/7] Simplify fold_single_bit_test with respect to code

2023-05-19 Thread Andrew Pinski via Gcc-patches

Since we know that fold_single_bit_test is now only passed
NE_EXPR or EQ_EXPR, we can simplify it and just use a gcc_assert
to assert that is the code that is being passed.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Add an assert
and simplify based on code being NE_EXPR or EQ_EXPR.
---
 gcc/expr.cc | 108 ++--
 1 file changed, 53 insertions(+), 55 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 67a9f82ca17..b5bc3fabb7e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12909,72 +12909,70 @@ fold_single_bit_test (location_t loc, enum tree_code code,
  tree inner, int bitnum,
  tree result_type)
 {
-  if ((code == NE_EXPR || code == EQ_EXPR))
-{
-  tree type = TREE_TYPE (inner);
-  scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type);
-  int ops_unsigned;
-  tree signed_type, unsigned_type, intermediate_type;
-  tree one;
-  gimple *inner_def;
+  gcc_assert (code == NE_EXPR || code == EQ_EXPR);
 
-  /* First, see if we can fold the single bit test into a sign-bit
-test.  */
-  if (bitnum == TYPE_PRECISION (type) - 1
- && type_has_mode_precision_p (type))
-   {
- tree stype = signed_type_for (type);
- return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
- result_type,
- fold_convert_loc (loc, stype, inner),
- build_int_cst (stype, 0));
-   }
+  tree type = TREE_TYPE (inner);
+  scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type);
+  int ops_unsigned;
+  tree signed_type, unsigned_type, intermediate_type;
+  tree one;
+  gimple *inner_def;
 
-  /* Otherwise we have (A & C) != 0 where C is a single bit,
-convert that into ((A >> C2) & 1).  Where C2 = log2(C).
-Similarly for (A & C) == 0.  */
+  /* First, see if we can fold the single bit test into a sign-bit
+ test.  */
+  if (bitnum == TYPE_PRECISION (type) - 1
+  && type_has_mode_precision_p (type))
+{
+  tree stype = signed_type_for (type);
+  return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
+ result_type,
+ fold_convert_loc (loc, stype, inner),
+ build_int_cst (stype, 0));
+}
 
-  /* If INNER is a right shift of a constant and it plus BITNUM does
-not overflow, adjust BITNUM and INNER.  */
-  if ((inner_def = get_def_for_expr (inner, RSHIFT_EXPR))
- && TREE_CODE (gimple_assign_rhs2 (inner_def)) == INTEGER_CST
- && bitnum < TYPE_PRECISION (type)
- && wi::ltu_p (wi::to_wide (gimple_assign_rhs2 (inner_def)),
-   TYPE_PRECISION (type) - bitnum))
-   {
- bitnum += tree_to_uhwi (gimple_assign_rhs2 (inner_def));
- inner = gimple_assign_rhs1 (inner_def);
-   }
+  /* Otherwise we have (A & C) != 0 where C is a single bit,
+ convert that into ((A >> C2) & 1).  Where C2 = log2(C).
+ Similarly for (A & C) == 0.  */
 
-  /* If we are going to be able to omit the AND below, we must do our
-operations as unsigned.  If we must use the AND, we have a choice.
-Normally unsigned is faster, but for some machines signed is.  */
-  ops_unsigned = (load_extend_op (operand_mode) == SIGN_EXTEND
- && !flag_syntax_only) ? 0 : 1;
+  /* If INNER is a right shift of a constant and it plus BITNUM does
+ not overflow, adjust BITNUM and INNER.  */
+  if ((inner_def = get_def_for_expr (inner, RSHIFT_EXPR))
+   && TREE_CODE (gimple_assign_rhs2 (inner_def)) == INTEGER_CST
+   && bitnum < TYPE_PRECISION (type)
+   && wi::ltu_p (wi::to_wide (gimple_assign_rhs2 (inner_def)),
+TYPE_PRECISION (type) - bitnum))
+{
+  bitnum += tree_to_uhwi (gimple_assign_rhs2 (inner_def));
+  inner = gimple_assign_rhs1 (inner_def);
+}
 
-  signed_type = lang_hooks.types.type_for_mode (operand_mode, 0);
-  unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1);
-  intermediate_type = ops_unsigned ? unsigned_type : signed_type;
-  inner = fold_convert_loc (loc, intermediate_type, inner);
+  /* If we are going to be able to omit the AND below, we must do our
+ operations as unsigned.  If we must use the AND, we have a choice.
+ Normally unsigned is faster, but for some machines signed is.  */
+  ops_unsigned = (load_extend_op (operand_mode) == SIGN_EXTEND
+ && !flag_syntax_only) ? 0 : 1;
 
-  if (bitnum != 0)
-   inner = build2 (RSHIFT_EXPR, intermediate_type,
-   inner, size_int (bitnum));
+  signed_type = lang_hooks.types.type_for_mode (operand_mode, 0);
+  unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1);
+  intermediate_type =

[PATCH 7/7] Expand directly for single bit test

2023-05-19 Thread Andrew Pinski via Gcc-patches

Instead of creating trees for the expansion,
just expand directly, which makes the code a little simpler
and also reduces how much GC memory is used during the expansion.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Rename to ...
(expand_single_bit_test): This and expand directly.
(do_store_flag): Update for the renamed function.
---
 gcc/expr.cc | 63 -
 1 file changed, 28 insertions(+), 35 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index d04e8ed0204..6849c9627d0 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12899,15 +12899,14 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree *arg0, tree *arg1)
 }
 
 
-/* If CODE with arguments INNER & (1<

[PATCH 4/7] Simplify fold_single_bit_test slightly

2023-05-19 Thread Andrew Pinski via Gcc-patches

Now that the only use of fold_single_bit_test is in do_store_flag,
we can change it to take the inner arg and bitnum
instead of building a tree. There are no code generation changes
due to this change, only a decrease in the GC memory that is produced
during expansion.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Take inner and bitnum
instead of arg0 and arg1. Update the code.
(do_store_flag): Don't create a tree when calling
fold_single_bit_test instead just call it with the bitnum
and the inner tree.
---
 gcc/expr.cc | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index a61772b6808..67a9f82ca17 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12899,23 +12899,19 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree 
*arg0, tree *arg1)
 }
 
 
-/* If CODE with arguments ARG0 and ARG1 represents a single bit
+/* If CODE with arguments INNER & (1<

[PATCH 6/7] Use BIT_FIELD_REF inside fold_single_bit_test

2023-05-19 Thread Andrew Pinski via Gcc-patches

Instead of depending on combine to do the extraction,
let's create a tree which will expand directly into
the extraction. This improves code generation on some
targets.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Use BIT_FIELD_REF
instead of shift/and.
---
 gcc/expr.cc | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index b5bc3fabb7e..d04e8ed0204 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12957,22 +12957,21 @@ fold_single_bit_test (location_t loc, enum tree_code 
code,
   intermediate_type = ops_unsigned ? unsigned_type : signed_type;
   inner = fold_convert_loc (loc, intermediate_type, inner);
 
-  if (bitnum != 0)
-inner = build2 (RSHIFT_EXPR, intermediate_type,
-   inner, size_int (bitnum));
+  tree bftype = build_nonstandard_integer_type (1, 1);
+  int bitpos = bitnum;
 
-  one = build_int_cst (intermediate_type, 1);
+  if (BYTES_BIG_ENDIAN)
+bitpos = GET_MODE_BITSIZE (operand_mode) - 1 - bitpos;
 
-  if (code == EQ_EXPR)
-inner = fold_build2_loc (loc, BIT_XOR_EXPR, intermediate_type, inner, one);
+  inner = build3_loc (loc, BIT_FIELD_REF, bftype, inner,
+ bitsize_int (1), bitsize_int (bitpos));
 
-  /* Put the AND last so it can combine with more things.  */
-  inner = build2 (BIT_AND_EXPR, intermediate_type, inner, one);
+  one = build_int_cst (bftype, 1);
 
-  /* Make sure to return the proper type.  */
-  inner = fold_convert_loc (loc, result_type, inner);
+  if (code == EQ_EXPR)
+inner = fold_build2_loc (loc, BIT_XOR_EXPR, bftype, inner, one);
 
-  return inner;
+  return fold_convert_loc (loc, result_type, inner);
 }
 
 /* Generate code to calculate OPS, and exploded expression
-- 
2.17.1
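
The big-endian adjustment of the bit position in the hunk above can be illustrated in isolation. This is only a sketch: bit_field_position is a hypothetical helper, and its bytes_big_endian parameter stands in for GCC's BYTES_BIG_ENDIAN target macro.

```c
#include <assert.h>

/* Hypothetical helper mirroring the bitpos computation in the patch:
   BITNUM counts from the least significant bit, while the position
   handed to BIT_FIELD_REF is flipped on big-endian targets.  */
unsigned bit_field_position(unsigned bitnum, unsigned mode_bitsize,
                            int bytes_big_endian)
{
    assert(bitnum < mode_bitsize);
    return bytes_big_endian ? mode_bitsize - 1 - bitnum : bitnum;
}

/* A few spot checks for 8-bit and 32-bit modes.  */
void check_bit_field_position(void)
{
    assert(bit_field_position(5, 32, 0) == 5);
    assert(bit_field_position(5, 32, 1) == 26);
    assert(bit_field_position(0, 8, 1) == 7);
    assert(bit_field_position(7, 8, 1) == 0);
}
```

Applying the big-endian mapping twice gives back the original bit number, so the adjustment loses no information.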

[PATCH 3/7] Use get_def_for_expr in fold_single_bit_test

2023-05-19 Thread Andrew Pinski via Gcc-patches

The code in fold_single_bit_test checks if
the inner was a right shift and improves the bitnum
based on that. But since the inner will always be an
SSA_NAME at this point, the code is dead. Move it over
to use the helper function get_def_for_expr instead.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test): Use get_def_for_expr
instead of checking the inner's code.
---
 gcc/expr.cc | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 6221b6991c5..a61772b6808 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12920,6 +12920,7 @@ fold_single_bit_test (location_t loc, enum tree_code 
code,
   int ops_unsigned;
   tree signed_type, unsigned_type, intermediate_type;
   tree one;
+  gimple *inner_def;
 
   /* First, see if we can fold the single bit test into a sign-bit
 test.  */
@@ -12939,14 +12940,14 @@ fold_single_bit_test (location_t loc, enum tree_code 
code,
 
   /* If INNER is a right shift of a constant and it plus BITNUM does
 not overflow, adjust BITNUM and INNER.  */
-  if (TREE_CODE (inner) == RSHIFT_EXPR
- && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
+  if ((inner_def = get_def_for_expr (inner, RSHIFT_EXPR))
+ && TREE_CODE (gimple_assign_rhs2 (inner_def)) == INTEGER_CST
  && bitnum < TYPE_PRECISION (type)
- && wi::ltu_p (wi::to_wide (TREE_OPERAND (inner, 1)),
+ && wi::ltu_p (wi::to_wide (gimple_assign_rhs2 (inner_def)),
TYPE_PRECISION (type) - bitnum))
{
- bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
- inner = TREE_OPERAND (inner, 0);
+ bitnum += tree_to_uhwi (gimple_assign_rhs2 (inner_def));
+ inner = gimple_assign_rhs1 (inner_def);
}
 
   /* If we are going to be able to omit the AND below, we must do our
-- 
2.17.1
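
The adjustment of BITNUM and INNER that this patch keeps alive can be checked with plain C. The sketch below assumes a 32-bit type, and the function names are hypothetical. It compares testing bit BITNUM of (x >> shift) with testing bit (bitnum + shift) of x, under the same "does not overflow" condition the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Test bit BITNUM of (x >> shift), i.e. the unfolded form.  */
int test_bit_of_shifted(uint32_t x, unsigned shift, unsigned bitnum)
{
    assert(shift < 32 && bitnum < 32);
    return (int)(((x >> shift) >> bitnum) & 1u);
}

/* Folded form: test bit (bitnum + shift) of x directly.  This is only
   valid while shift < precision - bitnum, as in the patch.  */
int test_bit_folded(uint32_t x, unsigned shift, unsigned bitnum)
{
    assert(bitnum < 32 && shift < 32 - bitnum);
    return (int)((x >> (bitnum + shift)) & 1u);
}

/* Compare both forms for every valid (bitnum, shift) pair.  */
void check_shift_folding(void)
{
    const uint32_t x = UINT32_C(0xA5A5F00F);
    for (unsigned bitnum = 0; bitnum < 32; bitnum++)
        for (unsigned shift = 0; shift < 32 - bitnum; shift++)
            assert(test_bit_of_shifted(x, shift, bitnum)
                   == test_bit_folded(x, shift, bitnum));
}
```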

[PATCH 2/7] Inline and simplify fold_single_bit_test_into_sign_test into fold_single_bit_test

2023-05-19 Thread Andrew Pinski via Gcc-patches

Since the last use of fold_single_bit_test_into_sign_test is fold_single_bit_test,
we can inline it and even simplify the inlined version. This causes
no behavior change.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* expr.cc (fold_single_bit_test_into_sign_test): Inline into ...
(fold_single_bit_test): This and simplify.
---
 gcc/expr.cc | 51 ++-
 1 file changed, 10 insertions(+), 41 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index f999f81af4a..6221b6991c5 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12899,42 +12899,6 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree 
*arg0, tree *arg1)
 }
 
 
-
-/* If CODE with arguments ARG0 and ARG1 represents a single bit
-   equality/inequality test, then return a simplified form of the test
-   using a sign testing.  Otherwise return NULL.  TYPE is the desired
-   result type.  */
-
-static tree
-fold_single_bit_test_into_sign_test (location_t loc,
-enum tree_code code, tree arg0, tree arg1,
-tree result_type)
-{
-  /* If this is testing a single bit, we can optimize the test.  */
-  if ((code == NE_EXPR || code == EQ_EXPR)
-  && TREE_CODE (arg0) == BIT_AND_EXPR && integer_zerop (arg1)
-  && integer_pow2p (TREE_OPERAND (arg0, 1)))
-{
-  /* If we have (A & C) != 0 where C is the sign bit of A, convert
-this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
-  tree arg00 = sign_bit_p (TREE_OPERAND (arg0, 0), TREE_OPERAND (arg0, 1));
-
-  if (arg00 != NULL_TREE
- /* This is only a win if casting to a signed type is cheap,
-i.e. when arg00's type is not a partial mode.  */
- && type_has_mode_precision_p (TREE_TYPE (arg00)))
-   {
- tree stype = signed_type_for (TREE_TYPE (arg00));
- return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
- result_type,
- fold_convert_loc (loc, stype, arg00),
- build_int_cst (stype, 0));
-   }
-}
-
-  return NULL_TREE;
-}
-
 /* If CODE with arguments ARG0 and ARG1 represents a single bit
equality/inequality test, then return a simplified form of
the test using shifts and logical operations.  Otherwise return
@@ -12955,14 +12919,19 @@ fold_single_bit_test (location_t loc, enum tree_code 
code,
   scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type);
   int ops_unsigned;
   tree signed_type, unsigned_type, intermediate_type;
-  tree tem, one;
+  tree one;
 
   /* First, see if we can fold the single bit test into a sign-bit
 test.  */
-  tem = fold_single_bit_test_into_sign_test (loc, code, arg0, arg1,
-result_type);
-  if (tem)
-   return tem;
+  if (bitnum == TYPE_PRECISION (type) - 1
+ && type_has_mode_precision_p (type))
+   {
+ tree stype = signed_type_for (type);
+ return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
+ result_type,
+ fold_convert_loc (loc, stype, inner),
+ build_int_cst (stype, 0));
+   }
 
   /* Otherwise we have (A & C) != 0 where C is a single bit,
 convert that into ((A >> C2) & 1).  Where C2 = log2(C).
-- 
2.17.1
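
The sign-bit special case inlined above, "(A & C) != 0 where C is the sign bit of A" becoming "A < 0", can be sketched in C for a 32-bit value. The function names are hypothetical, and the sketch assumes GCC's documented modulo behavior for converting an out-of-range value to a signed type.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form: (a & C) != 0 where C is the sign bit of a 32-bit value.  */
int sign_bit_set_mask(uint32_t a)
{
    return (a & UINT32_C(0x80000000)) != 0;
}

/* Comparison form: a < 0 after converting a to a signed type.  The
   conversion of an out-of-range value to int32_t is
   implementation-defined before C23; GCC reduces it modulo 2^32,
   which this sketch relies on.  */
int sign_bit_set_compare(uint32_t a)
{
    return (int32_t)a < 0;
}

/* Check the NE form (a < 0) and the EQ form (a >= 0) on sample values.  */
void check_sign_bit_forms(void)
{
    const uint32_t samples[] = { 0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t a = samples[i];
        assert(sign_bit_set_mask(a) == sign_bit_set_compare(a));
        assert(!sign_bit_set_mask(a) == ((int32_t)a >= 0));
    }
}
```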

[PATCH 1/7] Move fold_single_bit_test to expr.cc from fold-const.cc

2023-05-19 Thread Andrew Pinski via Gcc-patches

This is part 1 of an N-patch set that will change the expansion
of `(A & C) != 0` from using trees to directly expanding so that later
on we can do some cost analysis.

Since the only user of fold_single_bit_test is now
expand, move it there.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* fold-const.cc (fold_single_bit_test_into_sign_test): Move to
expr.cc.
(fold_single_bit_test): Likewise.
* expr.cc (fold_single_bit_test_into_sign_test): Move from fold-const.cc
(fold_single_bit_test): Likewise and make static.
* fold-const.h (fold_single_bit_test): Remove declaration.
---
 gcc/expr.cc   | 113 ++
 gcc/fold-const.cc | 112 -
 gcc/fold-const.h  |   1 -
 3 files changed, 113 insertions(+), 113 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 5ede094e705..f999f81af4a 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12898,6 +12898,119 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree 
*arg0, tree *arg1)
   *arg1 = treeop1;
 }
 
+
+
+/* If CODE with arguments ARG0 and ARG1 represents a single bit
+   equality/inequality test, then return a simplified form of the test
+   using a sign testing.  Otherwise return NULL.  TYPE is the desired
+   result type.  */
+
+static tree
+fold_single_bit_test_into_sign_test (location_t loc,
+enum tree_code code, tree arg0, tree arg1,
+tree result_type)
+{
+  /* If this is testing a single bit, we can optimize the test.  */
+  if ((code == NE_EXPR || code == EQ_EXPR)
+  && TREE_CODE (arg0) == BIT_AND_EXPR && integer_zerop (arg1)
+  && integer_pow2p (TREE_OPERAND (arg0, 1)))
+{
+  /* If we have (A & C) != 0 where C is the sign bit of A, convert
+this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
+  tree arg00 = sign_bit_p (TREE_OPERAND (arg0, 0), TREE_OPERAND (arg0, 1));
+
+  if (arg00 != NULL_TREE
+ /* This is only a win if casting to a signed type is cheap,
+i.e. when arg00's type is not a partial mode.  */
+ && type_has_mode_precision_p (TREE_TYPE (arg00)))
+   {
+ tree stype = signed_type_for (TREE_TYPE (arg00));
+ return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
+ result_type,
+ fold_convert_loc (loc, stype, arg00),
+ build_int_cst (stype, 0));
+   }
+}
+
+  return NULL_TREE;
+}
+
+/* If CODE with arguments ARG0 and ARG1 represents a single bit
+   equality/inequality test, then return a simplified form of
+   the test using shifts and logical operations.  Otherwise return
+   NULL.  TYPE is the desired result type.  */
+
+static tree
+fold_single_bit_test (location_t loc, enum tree_code code,
+ tree arg0, tree arg1, tree result_type)
+{
+  /* If this is testing a single bit, we can optimize the test.  */
+  if ((code == NE_EXPR || code == EQ_EXPR)
+  && TREE_CODE (arg0) == BIT_AND_EXPR && integer_zerop (arg1)
+  && integer_pow2p (TREE_OPERAND (arg0, 1)))
+{
+  tree inner = TREE_OPERAND (arg0, 0);
+  tree type = TREE_TYPE (arg0);
+  int bitnum = tree_log2 (TREE_OPERAND (arg0, 1));
+  scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type);
+  int ops_unsigned;
+  tree signed_type, unsigned_type, intermediate_type;
+  tree tem, one;
+
+  /* First, see if we can fold the single bit test into a sign-bit
+test.  */
+  tem = fold_single_bit_test_into_sign_test (loc, code, arg0, arg1,
+result_type);
+  if (tem)
+   return tem;
+
+  /* Otherwise we have (A & C) != 0 where C is a single bit,
+convert that into ((A >> C2) & 1).  Where C2 = log2(C).
+Similarly for (A & C) == 0.  */
+
+  /* If INNER is a right shift of a constant and it plus BITNUM does
+not overflow, adjust BITNUM and INNER.  */
+  if (TREE_CODE (inner) == RSHIFT_EXPR
+ && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
+ && bitnum < TYPE_PRECISION (type)
+ && wi::ltu_p (wi::to_wide (TREE_OPERAND (inner, 1)),
+   TYPE_PRECISION (type) - bitnum))
+   {
+ bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
+ inner = TREE_OPERAND (inner, 0);
+   }
+
+  /* If we are going to be able to omit the AND below, we must do our
+operations as unsigned.  If we must use the AND, we have a choice.
+Normally unsigned is faster, but for some machines signed is.  */
+  ops_unsigned = (load_extend_op (operand_mode) == SIGN_EXTEND
+ && !flag_syntax_only) ? 0 : 1;
+
+  signed_type = lang_hooks.types.type_for_mode (operand_mode, 0);
+  unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1);
+

[PATCH 0/7] Improve do_store_flag

2023-05-19 Thread Andrew Pinski via Gcc-patches

This patch set improves do_store_flag for the single bit case.
We go back to expanding the code directly rather than building some
trees. Also, instead of using shift+and, we use bit_field extraction
directly; this improves code generation on avr.

Andrew Pinski (7):
  Move fold_single_bit_test to expr.cc from fold-const.cc
  Inline and simplify fold_single_bit_test_into_sign_test into
fold_single_bit_test
  Use get_def_for_expr in fold_single_bit_test
  Simplify fold_single_bit_test slightly
  Simplify fold_single_bit_test with respect to code
  Use BIT_FIELD_REF inside fold_single_bit_test
  Expand directly for single bit test

 gcc/expr.cc   |  91 -
 gcc/fold-const.cc | 112 --
 gcc/fold-const.h  |   1 -
 3 files changed, 81 insertions(+), 123 deletions(-)

-- 
2.17.1
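
For readers unfamiliar with the transformation this series deals with, the following C sketch compares the source-level single bit test with the shift+and form. It assumes a 32-bit operand, and the function names are hypothetical, not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Source-level form: (a & (1 << bitnum)) != 0.  */
int single_bit_set_mask(uint32_t a, unsigned bitnum)
{
    assert(bitnum < 32);
    return (a & (UINT32_C(1) << bitnum)) != 0;
}

/* Shift+and form: (a >> bitnum) & 1.  */
int single_bit_set_shift(uint32_t a, unsigned bitnum)
{
    assert(bitnum < 32);
    return (int)((a >> bitnum) & 1u);
}

/* Compare both forms for every bit position of a few sample values.  */
void check_single_bit_forms(void)
{
    const uint32_t samples[] = { 0u, 1u, 0x40u, 0x80000000u, 0xA5A5A5A5u };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned bitnum = 0; bitnum < 32; bitnum++)
            assert(single_bit_set_mask(samples[i], bitnum)
                   == single_bit_set_shift(samples[i], bitnum));
}
```

A single-bit extraction computes the same value as the shift+and form without requiring a multi-bit shift, which matters on targets where shift cost grows with the shift count.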

[Bug rtl-optimization/46943] Unnecessary ZERO_EXTEND

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46943

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2018-04-22 00:00:00 |2023-5-19
   Severity|normal  |enhancement

[Bug middle-end/98961] Failure to optimize successive comparisons with 0 into clz

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98961

--- Comment #5 from Andrew Pinski  ---
or could be a cost thing ...

[Bug middle-end/98961] Failure to optimize successive comparisons with 0 into clz

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98961

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
  Component|rtl-optimization|middle-end
   Last reconfirmed||2023-05-20

--- Comment #4 from Andrew Pinski  ---
Confirmed, I think this should happen at expand time and only if the target
does not have conditional compares (e.g. like aarch64).

[Bug rtl-optimization/89680] Redundant moves with -march=skylake for long long shift on 32bit x86

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89680

Andrew Pinski  changed:

   What|Removed |Added

  Known to work||10.1.0

--- Comment #2 from Andrew Pinski  ---
Looks like this was fixed in GCC 10.

[Bug tree-optimization/109287] Optimizing sal shr pairs when inlining function

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109287

--- Comment #2 from Andrew Pinski  ---
Actually it is closer to:
unsigned f(unsigned t, unsigned b, unsigned *tt)
{
if (b >= 16) __builtin_unreachable();
t *= 16;
t+= b;
*tt = t%16;
unsigned ttt =  t/16;
return ttt;
}

As we know the range of b will be [0,15] due to the loop
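
Under that range assumption, one possible simplified form can be sketched as below. This is my reading and is not stated in the bug report: the sketch fixes unsigned to 32 bits, replaces __builtin_unreachable() with an assert, and f_simplified is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the comment, with unsigned fixed to 32 bits and the
   range of b enforced by an assert instead of __builtin_unreachable().  */
uint32_t f_original(uint32_t t, uint32_t b, uint32_t *tt)
{
    assert(b < 16);
    t *= 16;
    t += b;
    *tt = t % 16;
    return t / 16;
}

/* With b in [0,15], the low four bits of t*16 + b are exactly b, and
   dividing by 16 leaves t with its top four bits cleared.  */
uint32_t f_simplified(uint32_t t, uint32_t b, uint32_t *tt)
{
    assert(b < 16);
    *tt = b;
    return t & UINT32_C(0x0FFFFFFF);
}

/* Compare both versions, including values of t that wrap when scaled.  */
void check_f_forms(void)
{
    const uint32_t ts[] = { 0u, 1u, 0x0FFFFFFFu, 0x10000000u, 0xFFFFFFFFu };
    for (unsigned i = 0; i < sizeof ts / sizeof ts[0]; i++)
        for (uint32_t b = 0; b < 16; b++) {
            uint32_t tt1, tt2;
            assert(f_original(ts[i], b, &tt1) == f_simplified(ts[i], b, &tt2));
            assert(tt1 == tt2);
        }
}
```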

[Bug tree-optimization/109287] Optimizing sal shr pairs when inlining function

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109287

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-05-20
  Component|middle-end  |tree-optimization
   Severity|normal  |enhancement

--- Comment #1 from Andrew Pinski  ---
Reduced down to:
unsigned f(unsigned t, unsigned b, unsigned *tt)
{
t *= 16;
t+= b;
unsigned ttt =  t/16;
*tt = t%16;
return ttt;
}

Confirmed.

[Bug middle-end/108847] unnecessary bitwise AND on boolean types and shifting of the "sign" bit

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108847

Andrew Pinski  changed:

   What|Removed |Added

 Target|x86_64-*-*  |x86_64-*-* aarch64-*-*
 Status|NEW |ASSIGNED

--- Comment #2 from Andrew Pinski  ---
I am messing around in this area

[Bug c/109912] #pragma GCC diagnostic ignored "-Wall" is ignored

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109912

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-20

--- Comment #1 from Andrew Pinski  ---
So it is all of the "meta"-options which have this issue as shown by:
```
#pragma GCC diagnostic warning "-Wunused"
#pragma GCC diagnostic ignored "-Wunused"

static int f() {return 0;}
```

Confirmed.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #9 from Andrew Pinski  ---
(In reply to Georg-Johann Lay from comment #6)
> Quite impressive improvement.  Maybe the last step can be achieved with a
> combiner pattern that combines extzv with a bit flip.
> 
> One problem is usually that there is no canonical form (sometimes
> zero_extract, sometimes shift+and, sometimes with subregs for extraction or
> paradoxical subregs for wider types, different behaviour for MSB, etc.).

Right, in this case combine tries:
(set (reg/i:QI 24 r24)
(zero_extract:QI (xor:QI (reg:QI 54)
(const_int 64 [0x40]))
(const_int 1 [0x1])
(const_int 6 [0x6])))

This even puts the xor inside the zero_extract, but I think you could handle
that once my patch set goes in.
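
The zero_extract of an xor that combine tries is equivalent to extracting the bit first and flipping it afterwards. A small C sketch of that equivalence for an 8-bit value follows; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Form combine tries: extract 1 bit at position 6 from (x ^ 0x40).  */
unsigned extract_of_xor(uint8_t x)
{
    return ((unsigned)(uint8_t)(x ^ 0x40) >> 6) & 1u;
}

/* Equivalent form: extract bit 6 of x, then flip the extracted bit.  */
unsigned xor_of_extract(uint8_t x)
{
    return (((unsigned)x >> 6) & 1u) ^ 1u;
}

/* Exhaustively compare the two forms over all 8-bit inputs.  */
void check_xor_extract_forms(void)
{
    for (unsigned x = 0; x < 256; x++)
        assert(extract_of_xor((uint8_t)x) == xor_of_extract((uint8_t)x));
}
```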

[PATCH] Mode-Switching: Fix local array maybe uninitialized warning

2023-05-19 Thread Pan Li via Gcc-patches

From: Pan Li 

There are 2 local arrays in the function optimize_mode_switching. They are
initialized conditionally at the beginning but then always consumed in
another loop. This may trigger the maybe-uninitialized warning, and may
result in a build failure when -Werror (warnings as errors) is enabled.

This patch initializes the local arrays to zero explicitly at
declaration.

Signed-off-by: Pan Li 

gcc/ChangeLog:

* mode-switching.cc (entity_map): Initialize the array to zero.
(bb_info): Ditto.
---
 gcc/mode-switching.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/mode-switching.cc b/gcc/mode-switching.cc
index 2d2818f5674..64ae2bc29c3 100644
--- a/gcc/mode-switching.cc
+++ b/gcc/mode-switching.cc
@@ -499,8 +499,8 @@ optimize_mode_switching (void)
   bool need_commit = false;
   static const int num_modes[] = NUM_MODES_FOR_MODE_SWITCHING;
 #define N_ENTITIES ARRAY_SIZE (num_modes)
-  int entity_map[N_ENTITIES];
-  struct bb_info *bb_info[N_ENTITIES];
+  int entity_map[N_ENTITIES] = {};
+  struct bb_info *bb_info[N_ENTITIES] = {};
   int i, j;
   int n_entities = 0;
   int max_num_modes = 0;
-- 
2.34.1
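
The pattern being fixed can be shown with a toy example. This is a hypothetical stand-in, not the real optimize_mode_switching: a first loop fills the array conditionally and a second loop reads every slot. The patch uses the C++ form "= {}" because GCC is built as C++; the C sketch uses "= {0}".

```c
#include <assert.h>

#define N_ENTITIES 4

/* Toy stand-in: record the enabled entities, then count the used slots.
   Without the initializer, the second loop would read slots that the
   first loop never wrote.  */
int count_active(const int enabled[N_ENTITIES])
{
    int entity_map[N_ENTITIES] = {0};   /* zero-initialized at declaration */
    int n_entities = 0;

    /* Conditional fill: only enabled entities get a (nonzero) entry.  */
    for (int i = 0; i < N_ENTITIES; i++)
        if (enabled[i])
            entity_map[n_entities++] = i + 1;

    /* Unconditional read of every slot.  */
    int active = 0;
    for (int i = 0; i < N_ENTITIES; i++)
        if (entity_map[i] != 0)
            active++;
    return active;
}

void check_count_active(void)
{
    const int some[N_ENTITIES] = { 1, 0, 1, 0 };
    const int none[N_ENTITIES] = { 0, 0, 0, 0 };
    assert(count_active(some) == 2);
    assert(count_active(none) == 0);
}
```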

[Bug tree-optimization/109038] Miss optimization to simplify bit_and + rotate to shift

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109038

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-19
 Ever confirmed|0   |1
  Component|middle-end  |tree-optimization

--- Comment #2 from Andrew Pinski  ---
Confirmed.

(simplify
 (rrotate (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
 (if (@1 == (type)(~0) >> (typebits-@2))
  (lshift @0 { typebits - @2; }))

(simplify
 (lrotate (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
 (if (@1 == (type)(~0) >> (@2))
  (lshift @0 { @2; }))

There could be more dealing with the result being logical shift right.
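
The two patterns can be checked with plain C for a 32-bit type. This is only an illustrative sketch: rotr32, rotl32 and the check function are hypothetical helpers, not match.pd code.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit rotates; n must be in 1..31 so neither shift is by 32.  */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return (x >> n) | (x << (32 - n));
}

static uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return (x << n) | (x >> (32 - n));
}

/* Check both rewrites for every rotate count from 1 to 31.  */
void check_rotate_of_masked_value(void)
{
    const uint32_t x = UINT32_C(0xDEADBEEF);
    for (unsigned n = 1; n < 32; n++) {
        /* rrotate (x & low_n_bits, n) == x << (32 - n)  */
        uint32_t low_n = UINT32_MAX >> (32 - n);
        assert(rotr32(x & low_n, n) == (uint32_t)(x << (32 - n)));

        /* lrotate (x & (~0 >> n), n) == x << n  */
        uint32_t low_rest = UINT32_MAX >> n;
        assert(rotl32(x & low_rest, n) == (uint32_t)(x << n));
    }
}
```

In both cases the mask clears exactly the bits that would wrap around, so the rotate degenerates into a left shift.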

[Bug target/55181] [10/11/12/13/14 Regression] Expensive shift loop where a bit-testing instruction could be used

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55181

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot 
gnu.org
 Status|NEW |ASSIGNED

--- Comment #26 from Andrew Pinski  ---
So I guess this is mine too.

With the patches I created to improve PR 109907 (attached there), the initial
RTL now looks like:
;; _9 = (unsigned char) _8;

(insn 6 5 0 (set (reg/v:QI 46 [  ])
(zero_extract:QI (subreg:QI (reg/v:SI 47 [ number ]) 3)
(const_int 1 [0x1])
(const_int 5 [0x5]))) "t2.c":4:6 -1
 (nil))

Where it was before:
;; _9 = (unsigned char) _8;

(insn 6 5 7 (set (reg:SI 48)
(lshiftrt:SI (reg/v:SI 47 [ number ])
(const_int 29 [0x1d]))) "t2.c":4:6 -1
 (nil))

(insn 7 6 0 (set (reg/v:QI 46 [  ])
(and:QI (subreg:QI (reg:SI 48) 0)
(const_int 1 [0x1]))) "t2.c":4:6 -1
 (nil))
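
The two RTL sequences compute the same bit: byte 3 of the 32-bit register holds bits 24..31 on a little-endian target such as avr, so bit 5 of that byte is bit 3*8 + 5 = 29 of the full value. A C sketch of the correspondence, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence: shift the whole 32-bit value right by 29, then mask.  */
unsigned bit29_shift_and(uint32_t number)
{
    return (number >> 29) & 1u;
}

/* New sequence: take byte 3 (bits 24..31) and extract bit 5 of it.  */
unsigned bit29_byte_extract(uint32_t number)
{
    uint8_t byte3 = (uint8_t)(number >> 24);
    return ((unsigned)byte3 >> 5) & 1u;
}

void check_bit29_forms(void)
{
    const uint32_t samples[] = { 0u, 0x20000000u, 0xDFFFFFFFu, 0xFFFFFFFFu,
                                 0x12345678u };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(bit29_shift_and(samples[i]) == bit29_byte_extract(samples[i]));
}
```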

Re: [V7][PATCH 1/2] Handle component_ref to a structure/union field including flexible array member [PR101832]

2023-05-19 Thread Bernhard Reutner-Fischer via Gcc-patches

On Fri, 19 May 2023 20:49:47 +
Qing Zhao via Gcc-patches  wrote:

> GCC extension accepts the case when a struct with a flexible array member
> is embedded into another struct or union (possibly recursively).

Do you mean TYPE_TRAILING_FLEXARRAY()?

> diff --git a/gcc/tree.h b/gcc/tree.h
> index 0b72663e6a1..237644e788e 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -786,7 +786,12 @@ extern void omp_clause_range_check_failed (const_tree, 
> const char *, int,
> (...) prototype, where arguments can be accessed with va_start and
> va_arg), as opposed to an unprototyped function.  */
>  #define TYPE_NO_NAMED_ARGS_STDARG_P(NODE) \
> -  (TYPE_CHECK (NODE)->type_common.no_named_args_stdarg_p)
> +  (FUNC_OR_METHOD_CHECK (NODE)->type_common.no_named_args_stdarg_p)
> +
> +/* True if this RECORD_TYPE or UNION_TYPE includes a flexible array member
> +   at the last field recursively.  */
> +#define TYPE_INCLUDE_FLEXARRAY(NODE) \
> +  (RECORD_OR_UNION_CHECK (NODE)->type_common.no_named_args_stdarg_p)

Until I read the description above, I read TYPE_INCLUDE_FLEXARRAY as an
option to include or not include something. The description hints more
at TYPE_INCLUDES_FLEXARRAY (with an S) to be a type which has at least
one member which has a trailing flexible array or which itself has a
trailing flexible array.

>  
>  /* In an IDENTIFIER_NODE, this means that assemble_name was called with
> this string as an argument.  */

[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

--- Comment #3 from Andrew Pinski  ---
Note for powerpc-darwin, VECTOR_TYPE_P  might need to be defined too.

[Bug c++/99451] [plugin] cannot enable specific dump for plugin passes

2023-05-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99451

--- Comment #2 from CVS Commits  ---
The trunk branch has been updated by Nathan Sidwell :

https://gcc.gnu.org/g:97a36b466ba1420210294f0a1dd7002054ba3b7e

commit r14-1004-g97a36b466ba1420210294f0a1dd7002054ba3b7e
Author: Nathan Sidwell 
Date:   Wed May 17 19:27:13 2023 -0400

Allow plugin dumps

Defer dump option parsing until plugins are initialized.  This allows one to
use plugin names for dumps.

PR other/99451
gcc/
* opts.h (handle_deferred_dump_options): Declare.
* opts-global.cc (handle_common_deferred_options): Do not handle
dump options here.
(handle_deferred_dump_options): New.
* toplev.cc (toplev::main): Call it after plugin init.

[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

--- Comment #2 from Andrew Pinski  ---
Created attachment 55123
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55123&action=edit
Patch to test

Does this patch work? If so assign it to me and I will apply it.

[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

gcc-12-20230519 is now available

2023-05-19 Thread GCC Administrator via Gcc

Snapshot gcc-12-20230519 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20230519/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-12 revision a4d13e54822a4a53137c9f5e23770a798a0b

You'll find:

 gcc-12-20230519.tar.xz   Complete GCC

  SHA256=64fb521d2d038412618b78a00b2bbe74328e6e3ab8af8afbb88991afea74300e
  SHA1=6c870d3256a6c9fa566114922397f43eeb1d24ab

Diffs from 12-20230512 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

--- Comment #1 from Andrew Pinski  ---
The problem is ROUND_TYPE_ALIGN is used in libobjc and then
RECORD_OR_UNION_TYPE_P is not defined there ...

[Bug middle-end/21161] [10/11/12/13/14 Regression] "clobbered by longjmp" warning ignores the data flow

2023-05-19 Thread eggert at cs dot ucla.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21161

Paul Eggert  changed:

   What|Removed |Added

 CC||eggert at cs dot ucla.edu

--- Comment #26 from Paul Eggert  ---
Created attachment 55122
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55122&action=edit
GCC bug 21161 as triggered by GNU diffutils

I ran into a similar problem when compiling GNU diffutils with gcc (GCC) 13.1.1
20230511 (Red Hat 13.1.1-2) on x86-64. Here is a stripped-down illustration of
the diffutils problem. Compile the attached program with:

gcc -O2 -W -S pr21161.i

The output, which is a false positive, is:

pr21161.i: In function ‘find_dir_file_pathname’:
pr21161.i:22:15: warning: variable ‘match’ might be clobbered by ‘longjmp’ or
‘vfork’ [-Wclobbered]
   22 |   char const *match = file;
  |   ^

Re: [PATCH 1/2] Improve do_store_flag for single bit comparison against 0

2023-05-19 Thread Andrew Pinski via Gcc-patches

On Fri, May 19, 2023 at 9:40 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 5/18/23 20:14, Andrew Pinski via Gcc-patches wrote:
> > While working on something else, I noticed we could improve
> > the code generation of the following function:
> > ```
> > unsigned f(unsigned t)
> > {
> >if (t & ~(1<<30)) __builtin_unreachable();
> >return t != 0;
> > }
> > ```
> > Right now we just emit a comparison against 0 instead
> > of just a shift right by 30.
> > There is code in do_store_flag which already optimizes
> > `(t & 1<<30) != 0` to `(t >> 30) & 1`. This patch
> > extends it to handle the case where we know t has
> > just one bit that can be nonzero.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > gcc/ChangeLog:
> >
> >   * expr.cc (do_store_flag): Extend the one bit checking case
> >   to handle the case where we don't have an and but rather still
> >   one bit is known to be non-zero.
> So as we touched on in IRC, the concern is targets where the cost of the
> shift depends on the number of bits shifted.  Can we look at costing
> here to determine the initial RTL generation approach?
>
> Another approach that would work for some targets is a single bit
> extract.  In theory we should be discovering the extract idiom from the
> shift+and form, but I'm always concerned that it's going to be missed
> for one or more oddball reasons.

I now have a patch set which does the extraction directly rather than having
combine try to combine it later on. This actually fixes an issue with the avr
target, which expands the shift into a loop. Since we are using
extract_bit_field, if a target does not have an extract pattern, it will
expand using the shift+and form instead.
I will resubmit this and the other patch after this new patch set is completed.

Thanks,
Andrew Pinski

>
> jeff
>
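
The quoted example can be sketched in C as follows. When only bit 30 of t can be nonzero, comparing against zero and shifting right by 30 give the same result. The sketch uses an assert in place of __builtin_unreachable(), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The quoted function: t is known to have at most bit 30 set.  */
uint32_t f_compare(uint32_t t)
{
    assert((t & ~(UINT32_C(1) << 30)) == 0);
    return t != 0;
}

/* The cheaper form the patch aims for: a plain shift right by 30.  */
uint32_t f_shift(uint32_t t)
{
    assert((t & ~(UINT32_C(1) << 30)) == 0);
    return t >> 30;
}

/* Only two inputs satisfy the precondition; check both.  */
void check_known_single_bit(void)
{
    assert(f_compare(0u) == f_shift(0u));
    assert(f_compare(UINT32_C(1) << 30) == f_shift(UINT32_C(1) << 30));
}
```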

[Bug c/91093] Error on implicit int by default

2023-05-19 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91093

Martin Uecker  changed:

   What|Removed |Added

 CC||muecker at gwdg dot de

--- Comment #3 from Martin Uecker  ---
*** Bug 106425 has been marked as a duplicate of this bug. ***

[Bug c/106425] implicit-int

2023-05-19 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106425

Martin Uecker  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #3 from Martin Uecker  ---

Duplicate.

*** This bug has been marked as a duplicate of bug 91093 ***

[Bug tree-optimization/101770] -Wmaybe-uninitialized false alarm with only locals in GNU diffutils

2023-05-19 Thread eggert at cs dot ucla.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101770

--- Comment #5 from Paul Eggert  ---
I can no longer reproduce the bug in bleeding-edge GNU diffutils, so this bug
is not so important in its own right - that is, it's merely that GCC 13.1.1
still mishandles w.i.

Re: [PATCH] nvptx: Add support for __builtin_nvptx_brev intrinsic.

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/6/23 10:04, Roger Sayle wrote:

This patch adds support for (a pair of) bit reversal intrinsics
__builtin_nvptx_brev and __builtin_nvptx_brevll which perform 32-bit
and 64-bit bit reversal (using nvptx's brev instruction) matching
the __brev and __brevll intrinsics provided by NVidia's nvcc compiler.
https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__INT.html

This patch has been tested on nvptx-none with make and make -k check
with no new failures.  Ok for mainline?

2023-05-06  Roger Sayle  

gcc/ChangeLog
 * config/nvptx/nvptx.cc (nvptx_expand_brev): Expand target
 builtin for bit reversal using brev instruction.
 (enum nvptx_builtins): Add NVPTX_BUILTIN_BREV and
 NVPTX_BUILTIN_BREVLL.
 (nvptx_init_builtins): Define "brev" and "brevll".
 (nvptx_expand_builtin): Expand NVPTX_BUILTIN_BREV and
 NVPTX_BUILTIN_BREVLL via nvptx_expand_brev function.
 * doc/extend.texi (Nvidia PTX Builtin-in Functions): New
 section, document __builtin_nvptx_brev{,ll}.

gcc/testsuite/ChangeLog
 * gcc.target/nvptx/brev-1.c: New 32-bit test case.
 * gcc.target/nvptx/brev-2.c: Likewise.
 * gcc.target/nvptx/brevll-1.c: New 64-bit test case.
 * gcc.target/nvptx/brevll-2.c: Likewise.

OK
jeff
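
As a portable reference for what a 32-bit bit reversal computes, here is a minimal C sketch. It is not the nvptx implementation, and brev32_reference is a hypothetical name; the builtin itself is only available when targeting nvptx.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the order of the 32 bits of x: bit 0 becomes bit 31, bit 1
   becomes bit 30, and so on.  */
uint32_t brev32_reference(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);   /* append the current low bit of x */
        x >>= 1;
    }
    return r;
}

void check_brev32_reference(void)
{
    assert(brev32_reference(UINT32_C(0x00000001)) == UINT32_C(0x80000000));
    assert(brev32_reference(UINT32_C(0x0000FFFF)) == UINT32_C(0xFFFF0000));
    /* Reversing twice gives back the original value.  */
    assert(brev32_reference(brev32_reference(UINT32_C(0x12345678)))
           == UINT32_C(0x12345678));
}
```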

Re: [PATCH] Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode.

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/17/23 00:57, liuhongt via Gcc-patches wrote:

r14-172-g0368d169492017 replaces GENERAL_REGS with NO_REGS in cost
calculation when the preferred register class is not known yet.
It regressed powerpc PR109610 and PR109858; it looks too aggressive to use
NO_REGS when mode can be allocated with GENERAL_REGS.
The patch takes a step back: it still uses GENERAL_REGS when
hard_regno_mode_ok holds for mode and GENERAL_REGS, and otherwise uses NO_REGS.
Kewen confirmed the patch fixed PR109858; I verified it also fixed PR109610.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
No big performance impact for SPEC2017 on icelake server.
Ok for trunk?

gcc/ChangeLog:

* ira-costs.cc (scan_one_insn): Only use NO_REGS in cost
calculation when !hard_regno_mode_ok for GENERAL_REGS and
mode, otherwise still use GENERAL_REGS.

BTW, Vlad is on PTO right now.  I'm sure he'll handle this after he
returns and starts digging out of all the stuff that's piled up.


jeff

Re: [PATCH] configure: Implement --enable-host-bind-now

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/16/23 09:37, Marek Polacek via Gcc-patches wrote:

As promised in the --enable-host-pie patch, this patch adds another
configure option, --enable-host-bind-now, which adds -z now when linking
the compiler executables in order to extend hardening.  BIND_NOW with RELRO
allows the GOT to be marked RO; this prevents GOT modification attacks.

This option does not affect linking of target libraries; you can use
LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.

With this patch:
$ readelf -Wd cc1{,plus} | grep FLAGS
  0x001e (FLAGS)  BIND_NOW
  0x6ffb (FLAGS_1)Flags: NOW PIE
  0x001e (FLAGS)  BIND_NOW
  0x6ffb (FLAGS_1)Flags: NOW PIE

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

c++tools/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.
* configure: Regenerate.

gcc/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Add
-Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
* configure: Regenerate.
* doc/install.texi: Document --enable-host-bind-now.

lto-plugin/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Link with
-z,now.
* configure: Regenerate.

OK
jeff

[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues

2023-05-19 Thread eggert at cs dot ucla.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
Bug 24639 depends on bug 101770, which changed state.

Bug 101770 Summary: -Wmaybe-uninitialized false alarm with only locals in GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101770

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

[Bug tree-optimization/101770] -Wmaybe-uninitialized false alarm with only locals in GNU diffutils

2023-05-19 Thread eggert at cs dot ucla.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101770

Paul Eggert  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED
            Version|11.2.1                      |13.1.1

--- Comment #4 from Paul Eggert  ---
I am seeing the bug with gcc (GCC) 13.1.1 20230511 (Red Hat 13.1.1-2) on x86-64
when compiling GNU diffutils, so although the bug was reported fixed on the
trunk last year, it appears that the fix hasn't propagated to GCC 13 despite the
Target Milestone being 13.0.

The symptoms are:

$ gcc -O2 -Wmaybe-uninitialized -S w.i
w.i: In function ‘edit’:
w.i:50:18: warning: ‘cmd1’ may be used uninitialized [-Wmaybe-uninitialized]
   50 |   return !cmd1;
      |          ^
w.i:7:11: note: ‘cmd1’ was declared here
    7 |   int cmd1;
      |       ^~~~

This appears to be the same bug as before so I am taking the liberty of
reopening the bug report.

Re: [V7][PATCH 2/2] Update documentation to clarify a GCC extension [PR77650]

2023-05-19 Thread Joseph Myers

On Fri, 19 May 2023, Qing Zhao via Gcc-patches wrote:

> +GCC extension accepts a structure containing an ISO C99 @dfn{flexible array

"The GCC extension" or "A GCC extension".

> +@item
> +A structure containing a C99 flexible array member, or a union containing
> +such a structure, is the middle field of another structure, for example:

There might be more than one middle field, and I think this case also 
includes where it's the *first* field - any field other than the last.

> +@smallexample
> +struct flex  @{ int length; char data[]; @};
> +
> +struct mid_flex @{ int m; struct flex flex_data; int n; @};
> +@end smallexample
> +
> +In the above, @code{mid_flex.flex_data.data[]} has undefined behavior.

And it's not literally mid_flex.flex_data.data[] that has undefined 
behavior, but trying to access a member of that array.

> +Compilers do not handle such case consistently, Any code relying on

"such a case", and "," should be "." at the end of a sentence.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [C PATCH] Remove dead code related to type compatibility across TUs.

2023-05-19 Thread Joseph Myers

On Fri, 19 May 2023, Martin Uecker via Gcc-patches wrote:

> Repost for stage 1.
> 
> 
> C: Remove dead code related to type compatibility across TUs.
> 
> Code to detect struct/unions across the same TU is not needed
> anymore. Code for determining compatibility of tagged types is
> preserved as it will be used for C2X. Some errors in the unused
> code are fixed.
> 
> Bootstrapped with no regressions for x86_64-pc-linux-gnu.
> 
> gcc/c/
> * c-decl.cc (set_type_context): Remove.
> (pop_scope, diagnose_mismatched_decls, pushdecl):
> Remove dead code.
> * c-typeck.cc (comptypes_internal): Remove dead code.
> (same_translation_unit_p): Remove.
> (tagged_types_tu_compatible_p): Some fixes.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[Bug objc/109913] New: [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures

2023-05-19 Thread seurer at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

Bug ID: 109913
   Summary: [14 regression] r14-976-g9907413a3a6aa3 causes more
than 300 objc/objc++ failures
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: objc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:9907413a3a6aa30a4a6db4756c445b40f04597f3, r14-976-g9907413a3a6aa3


commit 9907413a3a6aa30a4a6db4756c445b40f04597f3 (HEAD)
Author: Bernhard Reutner-Fischer 
Date:   Sun May 14 00:38:33 2023 +0200

gcc/config/*: use _P() defines from tree.h


FAIL: obj-c++.dg/basic.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/bitfield-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/bitfield-2.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/bitfield-4.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/cxx-ivars-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/cxx-scope-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/defs.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/demangle-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/demangle-2.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-10.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-3.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-4.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-5.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-6.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-9.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/except-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-class-meta.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-class.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-ivar.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-method.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-objc.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-objc_msg_lookup.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-object.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-property.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-protocol.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-resolve-method.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-sel.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-runtime-3.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/lookup-2.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O0 -flto -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O0 -flto -flto-partition=none -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O2 -flto -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O2 -flto -flto-partition=none -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/method-10.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-17.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-19.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-22.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-23.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/property/at-property-10.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-11.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-12.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-13.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-19.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-22.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-24.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-26.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-27.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-6.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-7.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-8.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL:

[V7][PATCH 2/2] Update documentation to clarify a GCC extension [PR77650]

2023-05-19 Thread Qing Zhao via Gcc-patches

on a structure with a C99 flexible array member being nested in
another structure.

"GCC extension accepts a structure containing an ISO C99 "flexible array
member", or a union containing such a structure (possibly recursively)
to be a member of a structure.

 There are two situations:

   * A structure containing a C99 flexible array member, or a union
 containing such a structure, is the last field of another structure,
 for example:

  struct flex  { int length; char data[]; };
  union union_flex { int others; struct flex f; };

  struct out_flex_struct { int m; struct flex flex_data; };
  struct out_flex_union { int n; union union_flex flex_data; };

 In the above, both 'out_flex_struct.flex_data.data[]' and
 'out_flex_union.flex_data.f.data[]' are considered as flexible
 arrays too.

   * A structure containing a C99 flexible array member, or a union
 containing such a structure, is the middle field of another structure,
 for example:

  struct flex  { int length; char data[]; };

  struct mid_flex { int m; struct flex flex_data; int n; };

 In the above, 'mid_flex.flex_data.data[]' has undefined behavior.
 Compilers do not handle such case consistently, Any code relying on
 such case should be modified to ensure that flexible array members
 only end up at the ends of structures.

 Please use warning option '-Wflex-array-member-not-at-end' to
 identify all such cases in the source code and modify them.  This
 warning will be on by default starting from GCC 15.
"
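
The two situations can be compared with a small layout check. This is a sketch that assumes a compiler accepting the extension (GCC does); the two helper function names are made up for illustration.

```c
#include <stddef.h>

struct flex { int length; char data[]; };

/* Accepted by GCC as an extension: the flexible array member still ends
   up at the very end of the enclosing object.  */
struct out_flex_struct { int m; struct flex flex_data; };

/* Also accepted, but the array is followed by another field; this is
   the layout -Wflex-array-member-not-at-end is meant to flag.  */
struct mid_flex { int m; struct flex flex_data; int n; };

/* Offset of the first byte of data[] inside struct mid_flex.  */
size_t
mid_flex_data_offset (void)
{
  return offsetof (struct mid_flex, flex_data) + offsetof (struct flex, data);
}

/* Offset of the field that follows the nested structure.  */
size_t
mid_flex_n_offset (void)
{
  return offsetof (struct mid_flex, n);
}
```

On common ABIs the two offsets are typically equal, so any byte stored through data[] in a struct mid_flex would land on top of n, which is why such code cannot be relied upon.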

gcc/c-family/ChangeLog:

* c.opt: New option -Wflex-array-member-not-at-end.

gcc/c/ChangeLog:

* c-decl.cc (finish_struct): Issue warnings for new option.

gcc/ChangeLog:

* doc/extend.texi: Document GCC extension on a structure containing
a flexible array member to be a member of another structure.

gcc/testsuite/ChangeLog:

* gcc.dg/variable-sized-type-flex-array.c: New test.
---
 gcc/c-family/c.opt|  5 +++
 gcc/c/c-decl.cc   |  9 
 gcc/doc/extend.texi   | 45 ++-
 .../gcc.dg/variable-sized-type-flex-array.c   | 31 +
 4 files changed, 89 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/variable-sized-type-flex-array.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index cddeece..c26d9801b63 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -737,6 +737,11 @@ Wformat-truncation=
 C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_format_trunc) Warning LangEnabledBy(C ObjC C++ LTO ObjC++,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2)
 Warn about calls to snprintf and similar functions that truncate output.
 
+Wflex-array-member-not-at-end
+C C++ Var(warn_flex_array_member_not_at_end) Warning
+Warn when a structure containing a C99 flexible array member as the last
+field is not at the end of another structure.
+
 Wif-not-aligned
 C ObjC C++ ObjC++ Var(warn_if_not_aligned) Init(1) Warning
 Warn when the field in a struct is not aligned.
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 2c620b681d9..9a48f28788d 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9293,6 +9293,15 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
TYPE_INCLUDE_FLEXARRAY (t)
  = is_last_field && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x));
 
+  if (warn_flex_array_member_not_at_end
+ && !is_last_field
+ && RECORD_OR_UNION_TYPE_P (TREE_TYPE (x))
+ && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x)))
+   warning_at (DECL_SOURCE_LOCATION (x),
+   OPT_Wflex_array_member_not_at_end,
+   "structure containing a flexible array member"
+   " is not at the end of another structure");
+
   if (DECL_NAME (x)
  || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
saw_named_field = true;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ed8b9c8a87b..6425ba57e88 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1751,7 +1751,50 @@ Flexible array members may only appear as the last member of a
 A structure containing a flexible array member, or a union containing
 such a structure (possibly recursively), may not be a member of a
 structure or an element of an array.  (However, these uses are
-permitted by GCC as extensions.)
+permitted by GCC as extensions, see details below.)
+@end itemize
+
+GCC extension accepts a structure containing an ISO C99 @dfn{flexible array
+member}, or a union containing such a structure (possibly recursively)
+to be a member of a structure.
+
+There are two situations:
+
+@itemize @bullet
+@item
+A structure containing a C99 flexible array member, or a union containing
+such a structure, is the last field of another structure, for example:
+
+@smallexample
+struct flex  @{ int length; char data[];

[Bug testsuite/101528] [11 regression] gcc.target/powerpc/int_128bit-runnable.c fails after r11-8743

2023-05-19 Thread cel at us dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101528

Carl Love  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cel at us dot ibm.com

--- Comment #6 from Carl Love  ---
I will look into this and see if the instruction counts have changed for some
reason.

[Bug c++/109876] [10/11/12/13/14 Regression] initializer_list not usable in constant expressions in a template

2023-05-19 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

--- Comment #7 from Marek Polacek  ---
// PR c++/109876

using size_t = decltype(sizeof 0);

namespace std {
template <class> struct initializer_list {
  const int *_M_array;
  size_t _M_len;
  constexpr size_t size() const { return _M_len; }
};
} // namespace std

template <int> struct Array {};
template <typename> void g()
{
  static constexpr std::initializer_list<int> num{2};
  Array<num.size()> ctx;
}

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #8 from Georg-Johann Lay  ---
avr.md has this:

> ;; ??? do_store_flag emits a hard-coded right shift to extract a bit without
> ;; even considering rtx_costs, extzv, or a bit-test.  See PR55181 for an example.

And I already tried to work around it in that PR, but forgot about it...

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #7 from Andrew Pinski  ---
(In reply to Georg-Johann Lay from comment #6) 
> (define_expand "extzv"
>   [(set (match_operand:QI 0 "register_operand")
> (zero_extract:QI (match_operand:QI 1 "register_operand")
>  (match_operand:QI 2 "const1_operand")
>  (match_operand:QI 3 "const_0_to_7_operand")))])
> 
> Maybe QI for op1 is not optimal, but it's not possible to use mode iterator
> because there's only one gen_extzv.  Dunno if VOIDmode would help or is sane.

Note extzv pattern has been deprecated since 4.8 with r0-120368-gd2eeb2d179a435
which added extzv and co as being supported. So maybe moving over to
using that instead on avr backend might help here ...

[Bug c/70418] VM structure type specifier in list of parameter declarations within nested function definition ices.

2023-05-19 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70418

--- Comment #8 from Martin Uecker  ---
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618911.html

Re: [C PATCH v2] Fix ICEs related to VM types in C [PR106465, PR107557, PR108423, PR109450]

2023-05-19 Thread Joseph Myers

On Fri, 19 May 2023, Martin Uecker via Gcc-patches wrote:

> Thanks Joseph! 
> 
> Revised version attached. Ok?

The C front-end changes and tests are OK.

> But I wonder whether we generally need to do something 
> about
> 
>   sizeof *x
> 
> when x is NULL or not initialized. This is quite commonly
> used in C code and if the type is not of variable size,
> it is also unproblematic.  So the UB for variable size is
> unfortunate and certainly also affects existing code in
> the wild.  In practice it does not seem to cause
> problems because there is no lvalue conversion and this
> then seems to work.  Maybe we document this as an 
> extension?  (and make sure in the C FE that it
> works)  This would also make this idiom valid:

There's certainly a tricky question of what exactly it means to evaluate 
*x as far as producing an lvalue but without converting it to an rvalue - 
but right now the C standard wording on unary '*' is clear that "if it 
points to an object, the result is an lvalue designating the object" and 
"If an invalid value has been assigned to the pointer, the behavior of the 
unary * operator is undefined.", i.e. it's the evaluation as far as 
producing an lvalue that produces undefined behavior, rather than the 
lvalue conversion (that doesn't happen in sizeof) that does so.  And 
indeed we probably would be able to define semantics that avoid UB if 
desired.
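
The difference between the two cases can be seen in a short sketch. The function names are made up for illustration, and the variably modified case deliberately points x at a real object so that it stays within the standard wording quoted above; n is assumed to be positive.

```c
#include <stddef.h>
#include <stdlib.h>

/* Fixed-size type: the operand of sizeof is not evaluated, so the
   null pointer is never dereferenced.  */
size_t
fixed_row_bytes (void)
{
  int (*x)[4] = NULL;
  return sizeof *x;
}

/* Variably modified type: the operand is evaluated, so this sketch
   points x at a real object first.  Assumes n > 0.  */
size_t
vla_row_bytes (int n)
{
  int (*x)[n] = malloc (sizeof (int[n]));
  size_t bytes = 0;
  if (x != NULL)
    bytes = sizeof *x;
  free (x);
  return bytes;
}
```

The idiom under discussion is the second function with x left null or uninitialized, which works in practice but is undefined by the current wording.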

-- 
Joseph S. Myers
jos...@codesourcery.com

[Bug c/70418] VM structure type specifier in list of parameter declarations within nested function definition ices.

2023-05-19 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70418

Martin Uecker  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |muecker at gwdg dot de

--- Comment #7 from Martin Uecker  ---
*** Bug 106465 has been marked as a duplicate of this bug. ***

[Bug c/106465] ICE for VLA in struct in parameter of nested function

2023-05-19 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106465

Martin Uecker  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #6 from Martin Uecker  ---

Was filed previously as PR70418

*** This bug has been marked as a duplicate of bug 70418 ***

[V7][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-05-19 Thread Qing Zhao via Gcc-patches

GCC extension accepts the case when a struct with a flexible array member
is embedded into another struct or union (possibly recursively).
__builtin_object_size should treat such struct as flexible size.
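
A minimal way to observe the behaviour in question is to query the builtin on the nested member. This is a sketch only: the struct and function names are illustrative, and the value returned depends on the compiler version and optimization level, so none is asserted here.

```c
#include <stddef.h>

struct flex { int length; char data[]; };
struct outer { int m; struct flex flex_data; };

/* Ask the compiler how many bytes it believes are available starting
   at the nested structure.  Mode 1 limits the answer to the closest
   enclosing subobject; (size_t) -1 means "unknown".  */
size_t
nested_flex_object_size (struct outer *p)
{
  return __builtin_object_size (&p->flex_data, 1);
}
```

If the nested structure is treated as having flexible size, the answer for an object of unknown allocation should not be capped at sizeof (struct flex).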

gcc/c/ChangeLog:

PR tree-optimization/101832
* c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
struct/union type.

gcc/lto/ChangeLog:

PR tree-optimization/101832
* lto-common.cc (compare_tree_sccs_1): Compare bit
TYPE_NO_NAMED_ARGS_STDARG_P or TYPE_INCLUDE_FLEXARRAY properly
for its corresponding type.

gcc/ChangeLog:

PR tree-optimization/101832
* print-tree.cc (print_node): Print new bit type_include_flexarray.
* tree-core.h (struct tree_type_common): Use bit no_named_args_stdarg_p
as type_include_flexarray for RECORD_TYPE or UNION_TYPE.
* tree-object-size.cc (addr_object_size): Handle structure/union type
when it has flexible size.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream
in bit no_named_args_stdarg_p properly for its corresponding type.
* tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream
out bit no_named_args_stdarg_p properly for its corresponding type.
* tree.h (TYPE_INCLUDE_FLEXARRAY): New macro TYPE_INCLUDE_FLEXARRAY.

gcc/testsuite/ChangeLog:

PR tree-optimization/101832
* gcc.dg/builtin-object-size-pr101832.c: New test.
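
The idea of one storage bit carrying two meanings, selected by the kind of type node, can be modelled in a few lines. Everything below is a toy with hypothetical names; it is not the layout of tree-core.h or the real accessor macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; names and layout are hypothetical, not tree-core.h.  */
enum toy_code { TOY_FUNCTION_TYPE, TOY_RECORD_TYPE, TOY_UNION_TYPE };

struct toy_type
{
  enum toy_code code;
  /* One storage bit with two meanings, selected by CODE.  */
  unsigned shared_flag : 1;
};

/* Valid only for function types.  */
bool
toy_no_named_args_stdarg_p (const struct toy_type *t)
{
  assert (t->code == TOY_FUNCTION_TYPE);
  return t->shared_flag;
}

/* Valid only for record and union types.  */
bool
toy_include_flexarray_p (const struct toy_type *t)
{
  assert (t->code == TOY_RECORD_TYPE || t->code == TOY_UNION_TYPE);
  return t->shared_flag;
}

/* Compare the flag under the meaning that applies to the node kind,
   in the spirit of the guarded compare_values calls in the patch.  */
bool
toy_flags_equal (const struct toy_type *a, const struct toy_type *b)
{
  if (a->code != b->code)
    return false;
  if (a->code == TOY_FUNCTION_TYPE)
    return toy_no_named_args_stdarg_p (a) == toy_no_named_args_stdarg_p (b);
  return toy_include_flexarray_p (a) == toy_include_flexarray_p (b);
}
```

Guarding each accessor by node kind is what keeps the reuse safe: a reader of the bit always knows which of the two meanings applies.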
---
 gcc/c/c-decl.cc   |  11 ++
 gcc/lto/lto-common.cc |   5 +-
 gcc/print-tree.cc |   5 +
 .../gcc.dg/builtin-object-size-pr101832.c | 134 ++
 gcc/tree-core.h   |   2 +
 gcc/tree-object-size.cc   |  23 ++-
 gcc/tree-streamer-in.cc   |   5 +-
 gcc/tree-streamer-out.cc  |   5 +-
 gcc/tree.h|   7 +-
 9 files changed, 192 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index b5b491cf2da..2c620b681d9 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9282,6 +9282,17 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
   /* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x.  */
   DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, x);
 
+  /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t.
+when x is an array and is the last field.  */
+  if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
+   TYPE_INCLUDE_FLEXARRAY (t)
+ = is_last_field && flexible_array_member_type_p (TREE_TYPE (x));
+  /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t
+when x is an union or record and is the last field.  */
+  else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
+   TYPE_INCLUDE_FLEXARRAY (t)
+ = is_last_field && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x));
+
   if (DECL_NAME (x)
  || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
saw_named_field = true;
diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 537570204b3..35827aab075 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -1275,7 +1275,10 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
   if (AGGREGATE_TYPE_P (t1))
compare_values (TYPE_TYPELESS_STORAGE);
   compare_values (TYPE_EMPTY_P);
-  compare_values (TYPE_NO_NAMED_ARGS_STDARG_P);
+  if (FUNC_OR_METHOD_TYPE_P (t1))
+   compare_values (TYPE_NO_NAMED_ARGS_STDARG_P);
+  if (RECORD_OR_UNION_TYPE_P (t1))
+   compare_values (TYPE_INCLUDE_FLEXARRAY);
   compare_values (TYPE_PACKED);
   compare_values (TYPE_RESTRICT);
   compare_values (TYPE_USER_ALIGN);
diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc
index ccecd3dc6a7..aaded53b1b1 100644
--- a/gcc/print-tree.cc
+++ b/gcc/print-tree.cc
@@ -632,6 +632,11 @@ print_node (FILE *file, const char *prefix, tree node, int indent,
  && TYPE_CXX_ODR_P (node))
fputs (" cxx-odr-p", file);
 
+  if ((code == RECORD_TYPE
+  || code == UNION_TYPE)
+ && TYPE_INCLUDE_FLEXARRAY (node))
+   fputs (" include-flexarray", file);
+
   /* The transparent-union flag is used for different things in
 different nodes.  */
   if ((code == UNION_TYPE || code == RECORD_TYPE)
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
new file mode 100644
index 000..60078e11634
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
@@ -0,0 +1,134 @@
+/* PR 101832: 
+   GCC extension accepts the case when a struct with a C99 flexible array
+   member is embedded into another struct (possibly recursively).
+   __builtin_object_size will treat such struct as flexible size.
+   However, when a structure with

[V7][PATCH 0/2]Accept and Handle the case when a structure including a FAM nested in another structure

2023-05-19 Thread Qing Zhao via Gcc-patches

Hi,

This is the 7th version of the patch, which is rebased on the latest trunk.
This is an important patch needed by the Linux Kernel security project. 

We have already had an extensive discussion on this issue, and I have gone
through 6 revisions of the patches based on the discussion and resolved
all the comments and suggestions raised during the discussion.

Compared to the 6th version, the major changes are:

1. update the documentation to replace the mentioning of GCC14 with
GCC15.
2. update the documentation to replace the following wording:
"A structure or a union with a C99 flexible array member"
with:
"A structure containing a C99 flexible array member, or a union containing
such a structure,"

Everything else is the same as in the 6th version. 

The 6th version is here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616312.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616313.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616314.html

Kees has tested the 6th version of the patch with the Linux kernel, and everything
is good. It resolved many false positives for bounds checking.

Notes for the review history of these patches (2 patches)
1.The patch 1/2: Handle component_ref to a structre/union field including
  flexible array member [PR101832]

   The C front-end part has been approved by Joseph.
   For the middle-end, most of the change has been reviewed by Richard
   (and modified based on his comments and suggestions), except the change
   in tree-object-size.cc.
  
2.The patch 2/2: Update documentation to clarify a GCC extension

   This is basically a C FE and documentation change, I have updated it based
   on previous comments and suggestions.
   Joseph, could you review it to see whether this version is ready to go?

bootstrapped and regression tested on aarch64 and x86.

Okay for commit?

thanks a lot.

Qing

(for more details on the review history, I listed other important notes
below:


A. Richard Biener has reviewed the middle-end part of the first patch 
and raised some comments for the 4th version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613643.html

I updated it with his suggestion and Sandra’s comments as 5th version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614100.html

B. The comments for the 5th version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614511.html
(In this one, Joseph approved the C FE change of the first patch).
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614514.html
(In this one, Joseph raised two comments on the documentation wordings
 for the 2nd patch. And I updated  based on his comment in the 6th version)
)

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #6 from Georg-Johann Lay  ---
(In reply to Andrew Pinski from comment #4)
> For cset_32bit30_not with some patches which I will be posting, I get:
> bst r25,6        ;  23  [c=4 l=3]  *extzv/4
> clr r24
> bld r24,0
> ldi r25,lo8(1)   ;  24  [c=4 l=1]  movqi_insn/1
> eor r24,r25      ;  25  [c=4 l=1]  *xorqi3
> /* epilogue start */
> ret              ;  28  [c=0 l=1]  return
> 
> Which is better than what was there before.

Quite impressive improvement.  Maybe the last step can be achieved with a
combiner pattern that combines extzv with a bit flip.

One problem is usually that there is no canonical form (sometimes zero_extract,
sometimes shift+and, sometimes with subregs for extraction or paradoxical
subregs for wider types, different behaviour for MSB, etc.).
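
At the source level the same variety exists, which may help explain why so many RTL shapes show up. The functions below are illustrative only (the names are made up); each pair computes the same single-bit result in a different way, with the MSB allowing an extra sign-based form.

```c
#include <stdint.h>

/* Bit 30: shift, then mask.  */
unsigned
bit30_shift_and (uint32_t x)
{
  return (x >> 30) & 1u;
}

/* Bit 30: mask, then compare against zero.  */
unsigned
bit30_mask_test (uint32_t x)
{
  return (x & (UINT32_C (1) << 30)) != 0;
}

/* MSB: a plain shift needs no mask.  */
unsigned
msb_shift (uint32_t x)
{
  return x >> 31;
}

/* MSB: sign test.  Relies on the implementation-defined conversion to
   int32_t wrapping around, as it does with GCC.  */
unsigned
msb_sign_test (uint32_t x)
{
  return (int32_t) x < 0;
}
```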

avr's extzv currently reads

(define_expand "extzv"
  [(set (match_operand:QI 0 "register_operand")
(zero_extract:QI (match_operand:QI 1 "register_operand")
 (match_operand:QI 2 "const1_operand")
 (match_operand:QI 3 "const_0_to_7_operand")))])

Maybe QI for op1 is not optimal, but it's not possible to use mode iterator
because there's only one gen_extzv.  Dunno if VOIDmode would help or is sane.

> The first one I suspect load_extend_op for SImode returning SIGN_EXTEND for
> avr.

It's not implemented for avr, thus UNKNOWN as of defaults.h.

[C PATCH] Remove dead code related to type compatibility across TUs.

2023-05-19 Thread Martin Uecker via Gcc-patches



Repost for stage 1.


C: Remove dead code related to type compatibility across TUs.

Code to detect struct/unions across the same TU is not needed
anymore. Code for determining compatibility of tagged types is
preserved as it will be used for C2X. Some errors in the unused
code are fixed.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (set_type_context): Remove.
(pop_scope, diagnose_mismatched_decls, pushdecl):
Remove dead code.
* c-typeck.cc (comptypes_internal): Remove dead code.
(same_translation_unit_p): Remove.
(tagged_types_tu_compatible_p): Some fixes.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f63c1108ab5..70345b4b019 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -1155,16 +1155,6 @@ update_label_decls (struct c_scope *scope)
 }
 }
 
-/* Set the TYPE_CONTEXT of all of TYPE's variants to CONTEXT.  */
-
-static void
-set_type_context (tree type, tree context)
-{
-  for (type = TYPE_MAIN_VARIANT (type); type;
-   type = TYPE_NEXT_VARIANT (type))
-TYPE_CONTEXT (type) = context;
-}
-
 /* Exit a scope.  Restore the state of the identifier-decl mappings
that were in effect when this scope was entered.  Return a BLOCK
node containing all the DECLs in this scope that are of interest
@@ -1253,7 +1243,6 @@ pop_scope (void)
case ENUMERAL_TYPE:
case UNION_TYPE:
case RECORD_TYPE:
- set_type_context (p, context);
 
  /* Types may not have tag-names, in which case the type
 appears in the bindings list with b->id NULL.  */
@@ -1364,12 +1353,7 @@ pop_scope (void)
 the TRANSLATION_UNIT_DECL.  This makes same_translation_unit_p
 work.  */
  if (scope == file_scope)
-   {
  DECL_CONTEXT (p) = context;
- if (TREE_CODE (p) == TYPE_DECL
- && TREE_TYPE (p) != error_mark_node)
-   set_type_context (TREE_TYPE (p), context);
-   }
 
  gcc_fallthrough ();
  /* Parameters go in DECL_ARGUMENTS, not BLOCK_VARS, and have
@@ -2318,21 +2302,18 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
{
  if (DECL_INITIAL (olddecl))
{
- /* If both decls are in the same TU and the new declaration
-isn't overriding an extern inline reject the new decl.
-In c99, no overriding is allowed in the same translation
-unit.  */
- if ((!DECL_EXTERN_INLINE (olddecl)
-  || DECL_EXTERN_INLINE (newdecl)
-  || (!flag_gnu89_inline
-  && (!DECL_DECLARED_INLINE_P (olddecl)
-  || !lookup_attribute ("gnu_inline",
-DECL_ATTRIBUTES (olddecl)))
-  && (!DECL_DECLARED_INLINE_P (newdecl)
-  || !lookup_attribute ("gnu_inline",
-DECL_ATTRIBUTES (newdecl
- )
- && same_translation_unit_p (newdecl, olddecl))
+ /* If the new declaration isn't overriding an extern inline
+reject the new decl. In c99, no overriding is allowed
+in the same translation unit.  */
+ if (!DECL_EXTERN_INLINE (olddecl)
+ || DECL_EXTERN_INLINE (newdecl)
+ || (!flag_gnu89_inline
+ && (!DECL_DECLARED_INLINE_P (olddecl)
+ || !lookup_attribute ("gnu_inline",
+   DECL_ATTRIBUTES (olddecl)))
+ && (!DECL_DECLARED_INLINE_P (newdecl)
+ || !lookup_attribute ("gnu_inline",
+   DECL_ATTRIBUTES (newdecl)
{
  auto_diagnostic_group d;
  error ("redefinition of %q+D", newdecl);
@@ -3360,18 +3341,11 @@ pushdecl (tree x)
 type to the composite of all the types of that declaration.
 After the consistency checks, it will be reset to the
 composite of the visible types only.  */
-  if (b && (TREE_PUBLIC (x) || same_translation_unit_p (x, b->decl))
- && b->u.type)
+  if (b && b->u.type)
TREE_TYPE (b->decl) = b->u.type;
 
-  /* The point of the same_translation_unit_p check here is,
-we want to detect a duplicate decl for a construct like
-foo() { extern bar(); } ... static bar();  but not if
-they are in different translation units.  In any case,
-the static does not go in the externals scope.  */
-  if (b
- && (TREE_PUBLIC (x) || same_translation_unit_p (x, b->decl))
- && duplicate_decls (x, b->decl))
+  /* the static does not go in the externals scope.  */
+  if (b && duplicate_decls (x, b->decl))

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #5 from Andrew Pinski  ---
Created attachment 55121
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55121&action=edit
patch set

Here is the patch set that improves cset_32bit30_not. I am still looking into
improving the other one.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #4 from Andrew Pinski  ---
For cset_32bit30_not with some patches which I will be posting, I get:
bst r25,6;  23  [c=4 l=3]  *extzv/4
clr r24
bld r24,0
ldi r25,lo8(1)   ;  24  [c=4 l=1]  movqi_insn/1
eor r24,r25  ;  25  [c=4 l=1]  *xorqi3
/* epilogue start */
ret  ;  28  [c=0 l=1]  return

Which is better than what was there before.

The way I get this is to use BIT_FIELD_REF inside fold_single_bit_test .

For the first one, I suspect the cause is load_extend_op for SImode returning
SIGN_EXTEND on avr.
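
To make the two code shapes under discussion concrete, here is a minimal C sketch. The source of cset_32bit30_not is not shown in this thread, so the function names and bodies below are assumptions chosen only to illustrate "shift then mask" versus "test a single bit in place", plus a negated variant.

```c
#include <stdint.h>

/* Shift-based form: move bit 30 down to bit 0, then mask it off.  */
static inline unsigned bit30_shift(uint32_t x)
{
    return (unsigned)((x >> 30) & 1u);
}

/* Single-bit-test form: mask bit 30 in place and compare against zero.
   On a target with bit-test instructions this needs no multi-bit shift.  */
static inline unsigned bit30_test(uint32_t x)
{
    return (x & (UINT32_C(1) << 30)) != 0;
}

/* Negated variant (hypothetical stand-in for the "_not" test case).  */
static inline unsigned bit30_not(uint32_t x)
{
    return bit30_test(x) ^ 1u;
}
```

Both forms compute the same value for every input, so which instruction sequence is emitted is purely a code-generation question.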

[Bug other/109910] GCC prologue/epilogue saves/restores callee-saved registers that are never changed

2023-05-19 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109910

--- Comment #1 from Georg-Johann Lay  ---
Note that df_regs_ever_live_p may be used before reload_completed, for example
in INITIAL_ELIMINATION_OFFSET.

Hence, scanning the insns by hand using, say, note_stores, does not work
because reload might still be in progress.

Re: [PATCH v4 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.

2023-05-19 Thread Jeff Law via Gcc-patches





On 5/16/23 06:35, Ajit Agarwal wrote:



On 29/04/23 5:03 am, Jeff Law wrote:



On 4/28/23 16:42, Hans-Peter Nilsson wrote:

On Sat, 22 Apr 2023, Ajit Agarwal via Gcc-patches wrote:


Hello All:

This new version of patch 4 improves the ree pass for the rs6000 target using 
defined ABI interfaces.
Bootstrapped and regtested on power64-linux-gnu.

Thanks & Regards
Ajit


 ree: Improve ree pass for rs6000 target using defined abi interfaces

  For the rs6000 target we see redundant zero and sign
  extensions.  This patch improves the ree pass to eliminate
  such redundant zero and sign extensions using defined
  ABI interfaces.

  2023-04-22  Ajit Kumar Agarwal  

gcc/ChangeLog:

  * ree.cc (combine_reaching_defs): Add zero_extend
  using defined abi interfaces.
  (add_removable_extension): use of defined abi interfaces
  for no reaching defs.
  (abi_extension_candidate_return_reg_p): New defined ABI function.
  (abi_extension_candidate_p): New defined ABI function.
  (abi_extension_candidate_argno_p): New defined ABI function.
  (abi_handle_regs_without_defs_p): New defined ABI function.

gcc/testsuite/ChangeLog:

  * g++.target/powerpc/zext-elim-3.C
---
   gcc/ree.cc    | 176 +++---
   .../g++.target/powerpc/zext-elim-3.C  |  16 ++
   2 files changed, 162 insertions(+), 30 deletions(-)
   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C

diff --git a/gcc/ree.cc b/gcc/ree.cc
index 413aec7c8eb..0de96b1ece1 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -473,7 +473,8 @@ get_defs (rtx_insn *insn, rtx reg, vec *dest)
   break;
   }
   -  gcc_assert (use != NULL);
+  if (use == NULL)
+    return NULL;
       ref_chain = DF_REF_CHAIN (use);
   @@ -514,7 +515,8 @@ get_uses (rtx_insn *insn, rtx reg)
   if (REGNO (DF_REF_REG (def)) == REGNO (reg))
     break;
   -  gcc_assert (def != NULL);
+  if (def == NULL)
+    return NULL;
       ref_chain = DF_REF_CHAIN (def);
   @@ -750,6 +752,103 @@ get_extended_src_reg (rtx src)
     return src;
   }
   +/* Return TRUE if the candidate insn is zero extend and regno is
+   an return  registers.  */
+
+static bool
+abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
+{
+  rtx set = single_set (insn);
+
+  if (GET_CODE (SET_SRC (set)) !=  ZERO_EXTEND)
+    return false;
+
+  if (FUNCTION_VALUE_REGNO_P (regno))
+    return true;
+
+  return false;
+}
+
+/* Return TRUE if reg source operand of zero_extend is argument registers
+   and not return registers and source and destination operand are same
+   and mode of source and destination operand are not same.  */
+
+static bool
+abi_extension_candidate_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+
+  if (GET_CODE (SET_SRC (set)) !=  ZERO_EXTEND)
+    return false;
+
+  machine_mode ext_dst_mode = GET_MODE (SET_DEST (set));
+  rtx orig_src = XEXP (SET_SRC (set),0);
+
+  bool copy_needed
+    = (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0)));
+
+  if (!copy_needed && ext_dst_mode != GET_MODE (orig_src)
+  && FUNCTION_ARG_REGNO_P (REGNO (orig_src))
+  && !abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
+    return true;
+
+  return false;
+}
+
+/* Return TRUE if the candidate insn is zero extend and regno is
+   an argument registers.  */
+
+static bool
+abi_extension_candidate_argno_p (rtx_code code, int regno)
+{
+  if (code !=  ZERO_EXTEND)
+    return false;
+
+  if (FUNCTION_ARG_REGNO_P (regno))
+    return true;
+
+  return false;
+}


I don't see anything in those functions that checks if
ZERO_EXTEND is actually a feature of the ABI, e.g. as opposed to
no extension or SIGN_EXTEND.  Do I miss something?

I don't think you missed anything.  That was one of the points I was making 
last week.  Somewhere, somehow we need to describe what the ABI mandates and 
guarantees.

So while what Ajit has done is a step forward, at some point the actual details 
of the ABI need to be described in a way that can be checked and consumed by 
REE.



The ABI information we need for the ree pass is the set of argument registers 
and return registers. Based on that I have described the interfaces that we 
need. Other than that we don't need any other ABI hooks. I have used the 
FUNCTION_VALUE_REGNO_P and FUNCTION_ARG_REGNO_P ABI hooks.

You're working with one of many ABIs, some of which have useful 
properties, some of which do not.


Simply testing FUNCTION_VALUE_REGNO_P/FUNCTION_ARG_REGNO_P is not 
sufficient.  You need to be able to query the ABI properties.


jeff
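
The objection above can be illustrated with a small C sketch. Everything in it is hypothetical: the enum, the struct and the function are not GCC interfaces. It only shows why "is this an argument or return register?" is a weaker question than "what extension does the ABI guarantee for this register?".

```c
#include <stdbool.h>

/* Hypothetical: what an ABI promises about the upper bits of a narrow
   value held in a register at a call boundary.  */
enum abi_ext_kind { ABI_EXT_NONE, ABI_EXT_ZERO, ABI_EXT_SIGN };

/* Hypothetical per-register description.  */
struct abi_reg_info {
    bool is_arg_or_return_reg;     /* what a REGNO_P-style test answers */
    enum abi_ext_kind guaranteed;  /* what the ABI actually guarantees */
};

/* An extension is redundant only if the register is an ABI register
   AND the ABI guarantees the same kind of extension.  */
static bool extension_is_redundant(const struct abi_reg_info *reg,
                                   enum abi_ext_kind wanted)
{
    if (!reg->is_arg_or_return_reg)
        return false;
    if (wanted == ABI_EXT_NONE)    /* nothing to eliminate */
        return false;
    return reg->guaranteed == wanted;
}
```

Under this model a zero_extend of an argument register is removable only when the ABI guarantees zero extension, not when it guarantees sign extension or nothing at all.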

Re: [PATCH] MIPS: don't expand large block move

2023-05-19 Thread Maciej W. Rozycki

On Fri, 19 May 2023, Jeff Law wrote:

> > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> > index ca491b981a3..00f26d5e923 100644
> > --- a/gcc/config/mips/mips.cc
> > +++ b/gcc/config/mips/mips.cc
> > @@ -8313,6 +8313,12 @@ mips_expand_block_move (rtx dest, rtx src, rtx
> > length)
> > }
> > else if (optimize)
> > {
> > + /* When the length is big enough, the lib call has better performace
> > +than load/store insns.
> > +In most platform, the value is about 64-128.
> > +And in fact lib call may be optimized with SIMD */
> > + if (INTVAL(length) >= 64)
> > +   return false;
> Just a formatting nit.  Space between INTVAL and the open paren for its
> argument list.

 This is oddly wrapped too.  I'd move "performace" (typo there!) to the 
second line, to align better with the rest of the text.

 Plus s/platform/platforms/ and there's a full stop missing along with two 
spaces at the end.  Also there's inconsistent style around <= and >=; the 
GNU Coding Standards ask for spaces around binary operators.  And "don't" 
in the change heading ought to be capitalised.

 In fact, I'd justify the whole paragraph as each sentence doesn't have to 
start on a new line, and the commit description could benefit from some 
reformatting too, as it's now odd to read.

> OK with that change.

 I think the conditional would be more readable if it were flattened 
though:

  if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
...
  else if (INTVAL (length) >= 64)
...
  else if (optimize)
...

or even:

  if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
...
  else if (INTVAL (length) < 64 && optimize)
...

One just wouldn't write it as proposed if creating the whole piece from 
scratch rather than retrofitting this extra conditional.

 Ultimately it may have to be tunable as LWL/LWR, etc. may be subject to 
fusion and may be faster after all.

  Maciej
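
As a sanity check on the flattening suggested above, the following C sketch models the decision as a pure function and compares the nested and flattened shapes. The two constants are placeholders, not the values used in mips.cc, and MOVE_LIBCALL stands for the `return false` path, which is assumed here to end in a library call as the patch comment suggests.

```c
#include <stdbool.h>

enum move_strategy { MOVE_STRAIGHT, MOVE_LOOP, MOVE_LIBCALL };

/* Placeholder values for illustration only.  */
#define MAX_MOVE_BYTES_STRAIGHT 32
#define LIBCALL_THRESHOLD 64

/* Shape of the proposed patch: threshold test nested under `optimize`.  */
static enum move_strategy choose_nested(long length, bool optimize)
{
    if (length <= MAX_MOVE_BYTES_STRAIGHT)
        return MOVE_STRAIGHT;
    else if (optimize) {
        if (length >= LIBCALL_THRESHOLD)
            return MOVE_LIBCALL;
        return MOVE_LOOP;
    }
    return MOVE_LIBCALL;
}

/* Flattened shape: one condition per strategy.  */
static enum move_strategy choose_flat(long length, bool optimize)
{
    if (length <= MAX_MOVE_BYTES_STRAIGHT)
        return MOVE_STRAIGHT;
    else if (length < LIBCALL_THRESHOLD && optimize)
        return MOVE_LOOP;
    return MOVE_LIBCALL;
}

/* Returns true when both shapes agree for every length in
   [0, max_length] and both settings of `optimize`.  */
static bool strategies_agree(long max_length)
{
    for (long len = 0; len <= max_length; len++)
        for (int opt = 0; opt <= 1; opt++)
            if (choose_nested(len, opt != 0) != choose_flat(len, opt != 0))
                return false;
    return true;
}
```

Calling strategies_agree with a bound above the threshold exercises all three branches of both functions.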

Re: [PATCH 08/14] fortran: use _P() defines from tree.h

2023-05-19 Thread Bernhard Reutner-Fischer via Gcc-patches

On Thu, 18 May 2023 21:20:41 +0200
Mikael Morin  wrote:

> Le 18/05/2023 à 17:18, Bernhard Reutner-Fischer a écrit :

> > I've fed gfortran.h into the script and found some CLASS_DATA spots,
> > see attached bootstrapped and tested patch.
> > Do we want to have that?  
> Some of it makes sense, but not all of it.
> 
> It is a macro to access the _data component of a class container.
> So for class-related stuff it makes sense to use CLASS_DATA, and 
> typically there will be a check that the type is BT_CLASS before.
> But for cases where we loop over all of the components of a type that is 
> not necessarily a class container, it doesn't make sense to use CLASS_DATA.
> 
> So I suggest to only keep the following hunks.
[]
> OK for those hunks.

Pushed those as r14-1001-g05b7cc7daac8b3
Many thanks!

PS: I'm attaching the fugly script I used to do these macro
replacements FYA.


use-defines.1.awk
Description: application/awk

Re: [PATCH] c++: mangle noexcept-expr [PR70790]

2023-05-19 Thread Patrick Palka via Gcc-patches

On Fri, 19 May 2023, Patrick Palka wrote:

> This implements noexcept-expr mangling (and demangling) as per the
> Itanium ABI.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this
> look OK for trunk?
> 
>   PR c++/70790
> 
> gcc/cp/ChangeLog:
> 
>   * mangle.cc (write_expression): Handle NOEXCEPT_EXPR.
> 
> libiberty/ChangeLog:
> 
>   * cp-demangle.c (cplus_demangle_operators): Add the noexcept
>   operator.

Oops, we should also make sure we print parens around the operand of
noexcept.  Otherwise we'd demangle the mangling of e.g.

  void f(A)

instead as

  void f(A)

Fixed in the following patch:

-- >8 --

Subject: [PATCH] c++: mangle noexcept-expr [PR70790]

This implements noexcept-expr mangling (and demangling) as per the
Itanium ABI.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this
look OK for trunk?

PR c++/70790

gcc/cp/ChangeLog:

* mangle.cc (write_expression): Handle NOEXCEPT_EXPR.

libiberty/ChangeLog:

* cp-demangle.c (cplus_demangle_operators): Add the noexcept
operator.
(d_print_comp_inner) : Always
print parens around the operand of noexcept too.
* testsuite/demangle-expected: Test noexcept operator
demangling.

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle78.C: New test.
---
 gcc/cp/mangle.cc  |  5 +
 gcc/testsuite/g++.dg/abi/mangle78.C   | 14 ++
 libiberty/cp-demangle.c   |  5 +++--
 libiberty/testsuite/demangle-expected |  3 +++
 4 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/mangle78.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 826c5e76c1d..7dab4e62bc9 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3402,6 +3402,11 @@ write_expression (tree expr)
   else
write_string ("tr");
 }
+  else if (code == NOEXCEPT_EXPR)
+{
+  write_string ("nx");
+  write_expression (TREE_OPERAND (expr, 0));
+}
   else if (code == CONSTRUCTOR)
 {
   bool braced_init = BRACE_ENCLOSED_INITIALIZER_P (expr);
diff --git a/gcc/testsuite/g++.dg/abi/mangle78.C 
b/gcc/testsuite/g++.dg/abi/mangle78.C
new file mode 100644
index 000..63c4d779e9f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle78.C
@@ -0,0 +1,14 @@
+// PR c++/70790
+// { dg-do compile { target c++11 } }
+
+template
+struct A { };
+
+template
+void f(A);
+
+int main() {
+  f({});
+}
+
+// { dg-final { scan-assembler "_Z1fIiEv1AIXnxtlT_EEE" } }
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index f2b36bcad68..efada1c322b 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1947,6 +1947,7 @@ const struct demangle_operator_info 
cplus_demangle_operators[] =
   { "ng", NL ("-"), 1 },
   { "nt", NL ("!"), 1 },
   { "nw", NL ("new"),   3 },
+  { "nx", NL ("noexcept"),  1 },
   { "oR", NL ("|="),2 },
   { "oo", NL ("||"),2 },
   { "or", NL ("|"), 2 },
@@ -5836,8 +5837,8 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
if (code && !strcmp (code, "gs"))
  /* Avoid parens after '::'.  */
  d_print_comp (dpi, options, operand);
-   else if (code && !strcmp (code, "st"))
- /* Always print parens for sizeof (type).  */
+   else if (code && (!strcmp (code, "st") || !strcmp (code, "nx")))
+ /* Always print parens for sizeof (type) or noexcept(expr).  */
  {
d_append_char (dpi, '(');
d_print_comp (dpi, options, operand);
diff --git a/libiberty/testsuite/demangle-expected 
b/libiberty/testsuite/demangle-expected
index d9bc7ed4b1f..52dff883a18 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -1659,3 +1659,6 @@ auto f()::{lambda(X<$T0>*, 
X*)#1}::operator()(X*,
 
 _ZZN1XIiE1FEvENKUliE_clEi
 X::F()::{lambda(int)#1}::operator()(int) const
+
+_Z1fIiEv1AIXnxtlT_EEE
+void f(A)
-- 
2.41.0.rc0.4.g004e0f790f

>   * testsuite/demangle-expected: Test noexcept operator
>   demangling.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/abi/mangle78.C: New test.
> ---
>  gcc/cp/mangle.cc  |  5 +
>  gcc/testsuite/g++.dg/abi/mangle78.C   | 14 ++
>  libiberty/cp-demangle.c   |  1 +
>  libiberty/testsuite/demangle-expected |  3 +++
>  4 files changed, 23 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/abi/mangle78.C
> 
> diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
> index 826c5e76c1d..7dab4e62bc9 100644
> --- a/gcc/cp/mangle.cc
> +++ b/gcc/cp/mangle.cc
> @@ -3402,6 +3402,11 @@ write_expression (tree expr)
>else
>   write_string ("tr");
>  }
> +  else if (code == NOEXCEPT_EXPR)
> +{
> +  write_string ("nx");
> +  write_expression (TREE_OPERAND (expr, 0));
> +}
>else if (code == CONSTRUCTOR)
>  {
>bool braced_init = BRACE_ENCLOSED_INITIALIZER_P (expr);
> diff --git

[PATCH v2] release the sorted FDE array when deregistering a frame [PR109685]

2023-05-19 Thread Thomas Neumann via Gcc-patches


Am 19.05.23 um 19:26 schrieb Jeff Law:

See:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617245.html

I think this needs an update given the other changes in this space.

jeff


I have included the updated patch below.



The atomic fastpath bypasses the code that releases the sort
array which was lazily allocated during unwinding. We now
check after deregistering if there is an array to free.

libgcc/ChangeLog:
* unwind-dw2-fde.c: Free sort array in atomic fast path.
---
 libgcc/unwind-dw2-fde.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index a5786bf729c..32b9e64a1c8 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -241,6 +241,12 @@ __deregister_frame_info_bases (const void *begin)
   // And remove
  ob = btree_remove (&registered_frames, range[0]);
   bool empty_table = (range[1] - range[0]) == 0;
+
+  // Deallocate the sort array if any.
+  if (ob && ob->s.b.sorted)
+{
+  free (ob->u.sort);
+}
 #else
   init_object_mutex_once ();
  __gthread_mutex_lock (&object_mutex);
--
2.39.2
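
The ownership pattern behind this fix can be sketched in isolation. The struct and function names below are hypothetical and much simpler than the real object layout in unwind-dw2-fde.c; the sketch only shows a lazily allocated array that has to be released on every deregistration path, including a fast path that skips the usual cleanup code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered frame object.  */
struct frame_object {
    const void *begin;
    bool sorted;      /* true once the lazy sort array exists */
    int *sort_array;  /* allocated on first lookup, not at registration */
};

/* Lazily build the sort array the first time it is needed.
   Returns false if the allocation fails.  */
static bool ensure_sorted(struct frame_object *ob, size_t count)
{
    if (ob->sorted)
        return true;
    ob->sort_array = calloc(count ? count : 1, sizeof *ob->sort_array);
    if (!ob->sort_array)
        return false;
    ob->sorted = true;
    return true;
}

/* Deregistration: release the lazily built array if there is one.
   Safe to call with NULL or on an object that was never sorted.  */
static void deregister_object(struct frame_object *ob)
{
    if (ob && ob->sorted) {
        free(ob->sort_array);
        ob->sort_array = NULL;
        ob->sorted = false;
    }
}
```

The check on `sorted` mirrors the guard in the patch: an object that was never used for unwinding has no array to free.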

[PATCH] c++: mangle noexcept-expr [PR70790]

2023-05-19 Thread Patrick Palka via Gcc-patches

This implements noexcept-expr mangling (and demangling) as per the
Itanium ABI.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this
look OK for trunk?

PR c++/70790

gcc/cp/ChangeLog:

* mangle.cc (write_expression): Handle NOEXCEPT_EXPR.

libiberty/ChangeLog:

* cp-demangle.c (cplus_demangle_operators): Add the noexcept
operator.
* testsuite/demangle-expected: Test noexcept operator
demangling.

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle78.C: New test.
---
 gcc/cp/mangle.cc  |  5 +
 gcc/testsuite/g++.dg/abi/mangle78.C   | 14 ++
 libiberty/cp-demangle.c   |  1 +
 libiberty/testsuite/demangle-expected |  3 +++
 4 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/abi/mangle78.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 826c5e76c1d..7dab4e62bc9 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3402,6 +3402,11 @@ write_expression (tree expr)
   else
write_string ("tr");
 }
+  else if (code == NOEXCEPT_EXPR)
+{
+  write_string ("nx");
+  write_expression (TREE_OPERAND (expr, 0));
+}
   else if (code == CONSTRUCTOR)
 {
   bool braced_init = BRACE_ENCLOSED_INITIALIZER_P (expr);
diff --git a/gcc/testsuite/g++.dg/abi/mangle78.C 
b/gcc/testsuite/g++.dg/abi/mangle78.C
new file mode 100644
index 000..a3647711604
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle78.C
@@ -0,0 +1,14 @@
+// PR c++/70790
+// { dg-do compile { target c++11 } }
+
+template
+struct A { };
+
+template
+void f(A);
+
+int main() {
+  f({});
+}
+
+// { dg-final { scan-assembler "_Z1fIiEv1AIXnxcvT__EEE" } }
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index f2b36bcad68..341c66db919 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1947,6 +1947,7 @@ const struct demangle_operator_info 
cplus_demangle_operators[] =
   { "ng", NL ("-"), 1 },
   { "nt", NL ("!"), 1 },
   { "nw", NL ("new"),   3 },
+  { "nx", NL ("noexcept"),  1 },
   { "oR", NL ("|="),2 },
   { "oo", NL ("||"),2 },
   { "or", NL ("|"), 2 },
diff --git a/libiberty/testsuite/demangle-expected 
b/libiberty/testsuite/demangle-expected
index d9bc7ed4b1f..7195cc39c19 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -1659,3 +1659,6 @@ auto f()::{lambda(X<$T0>*, 
X*)#1}::operator()(X*,
 
 _ZZN1XIiE1FEvENKUliE_clEi
 X::F()::{lambda(int)#1}::operator()(int) const
+
+_Z1fIiEv1AIXnxcvT__EEE
+void f(A)
-- 
2.41.0.rc0.4.g004e0f790f

[Bug c++/108788] Lookup of injected class name should be type-dependent

2023-05-19 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108788

Patrick Palka  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||ppalka at gcc dot gnu.org

--- Comment #2 from Patrick Palka  ---
Partially fixed by r12-3643-g18b57c1d4a8777.  Reduced version of what we still
reject:

template 
struct templ_base { };

template 
int get_templ_base(T&& v)
{
return v.templ_base::a; // fails in all gcc versions
}

: In function ‘int get_templ_base(T&&)’:
:7:14: error: ‘template struct templ_base’ used without
template arguments

[Bug preprocessor/109912] New: #pragma GCC diagnostic ignored "-Wall" is ignored

2023-05-19 Thread ed at catmur dot uk via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109912

Bug ID: 109912
   Summary: #pragma GCC diagnostic ignored "-Wall" is ignored
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ed at catmur dot uk
  Target Milestone: ---

#pragma GCC diagnostic warning "-Wall"
#pragma GCC diagnostic ignored "-Wall"
int i = 0 | 1 & 2;

warning: suggest parentheses around arithmetic in operand of '|'
[-Wparentheses]
3 | int i = 0 | 1 & 2;
  | ~~^~~

The expected behavior would be for `diagnostic ignored "-Wall"` to suppress all
the warnings that were enabled by `diagnostic warning "-Wall"`. If this isn't
possible, it would be good to emit a diagnostic that `diagnostic ignored
"-Wall"` has no effect.

Clang does support this and appears to have always done so.
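
Until the behavior described above changes, one possible workaround is to name the individual warning instead of "-Wall". This is a sketch of a workaround, not a fix for the report; the function name is made up for illustration, and it assumes naming each warning separately is acceptable in the code base.

```c
/* Save the diagnostic state, silence one named warning, then restore.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
static int mixed_precedence(void)
{
    /* & binds tighter than |, so this is 0 | (1 & 2).  */
    return 0 | 1 & 2;
}
#pragma GCC diagnostic pop
```

The push/pop pair keeps the suppression local, so code after the pop is diagnosed as before.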

[Bug target/109279] RISC-V: complex constants synthesized should be improved

2023-05-19 Thread vineetg at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #16 from Vineet Gupta  ---
> Which is what this produces:
> ```
> long long f(void)
> {
>   unsigned t = 16843009;
>   long long t1 = t;
>   long long t2 = ((unsigned long long )t) << 32;
>   asm("":"+r"(t1));
>   return t1 | t2;
> }
> ```
> I suspect: 0x0080402010080400ULL should be done as two 32bit with a shift/or
> added too. Will definitely improve complex constants forming too.
> 
> Right now the backend does (const<<16+const)<<16+const... which is just so
> bad.

Umm this testcase is a different problem. It used to generate the same output
but no longer after g2e886eef7f2b5a and the other related updates:
g0530254413f8 and gc104ef4b5eb1.

For the test above, the low and high words are created independently and then
stitched.

260r.dfinit

# lower word

(insn 6 2 7 2 (set (reg:DI 138)
(const_int [0x101]))  {*movdi_64bit}
(insn 7 6 8 2 (set (reg:DI 137)
(plus:DI (reg:DI 138)
(const_int [0x101]))) {adddi3}
 (expr_list:REG_EQUAL (const_int [0x1010101]) )
(insn 5 8 9 2 (set (reg/v:DI 134 [ t1 ])
(reg:DI 136 [ t1 ])) {*movdi_64bit}

# upper word (created independently, no reuse from prior values)

(insn 9 5 10 2 (set (reg:DI 141)
(const_int [0x101]))  {*movdi_64bit}
(insn 10 9 11 2 (set (reg:DI 142)
(plus:DI (reg:DI 141)
(const_int [0x101]))) {adddi3}
(insn 11 10 12 2 (set (reg:DI 140)
(ashift:DI (reg:DI 142)
(const_int 32 [0x20]))) {ashldi3}
(expr_list:REG_EQUAL (const_int [0x1010101]))

# stitch them
(insn 12 11 13 2 (set (reg:DI 139)
(ior:DI (reg/v:DI 134 [ t1 ])
(reg:DI 140))) "const2.c":7:13 99 {iordi3}


cse1 matches the new "*mvconst_internal" pattern independently on each of them 

(insn 7 6 8 2 (set (reg:DI 137)
(const_int [0x1010101])) {*mvconst_internal}
(expr_list:REG_EQUAL (const_int [0x1010101])))

(insn 11 10 12 2 (set (reg:DI 140)
(const_int [0x1010101_])) {*mvconst_internal}
(expr_list:REG_EQUAL (const_int   
[0x1010101_]) ))

This ultimately gets in the way, as otherwise it would find the equivalent reg
across the 2 snippets and reuse reg.

It is interesting that, due to the same pattern, split1 undoes what cse1 did, so
in theory cse2 (?) could redo it. Anyhow, this needs to be investigated. But ATM
we have the following codegen for the aforementioned test, which clearly needs
more work.

li      a0,16842752
addi    a0,a0,257
li      a5,16842752
slli    a0,a0,32
addi    a5,a5,257
or      a0,a5,a0
ret
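
For reference, the arithmetic behind that sequence, and the reuse it misses, can be written out in C. The function names are illustrative only. The sketch shows that the low word is 16842752 + 257 = 0x1010101 and that the full constant is that same word ORed with itself shifted left by 32, so it only needs to be materialized once.

```c
#include <stdint.h>

/* Low 32-bit word, built the way the li/addi pair does it.  */
static uint64_t low_word(void)
{
    return UINT64_C(16842752) + UINT64_C(257);  /* 0x1010000 + 0x101 */
}

/* Full constant, reusing the low word instead of rebuilding it:
   one shift and one OR on top of the li/addi pair.  */
static uint64_t stitched_constant(void)
{
    uint64_t lo = low_word();
    return lo | (lo << 32);
}
```

With the reuse, the second li/addi pair in the listing above would be unnecessary.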
