[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor into ifcombine
https://gcc.gnu.org/g:dea162ded07463211c60233b35c019963025c547

commit dea162ded07463211c60233b35c019963025c547
Author: Alexandre Oliva
Date:   Thu Sep 26 02:10:44 2024 -0300

    fold fold_truth_andor into ifcombine

    This patch introduces various improvements to the logic that merges
    field compares, moving it into ifcombine.

    Before the patch, we could merge:

      (a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1)

    into something like:

      (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

    if both of A's fields live within the same alignment boundaries, and
    so do B's, at the same relative positions.  Constants may be used
    instead of the object B.

    The initial goal of this patch was to enable such combinations when a
    field crossed alignment boundaries, e.g. for packed types.  We can't
    generally access such fields with a single memory access, so when we
    come across such a compare, we will attempt to combine each access
    separately.

    Some merging opportunities were missed because of right-shifts,
    compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
    narrowing conversions, especially after earlier merges.  This patch
    introduces handlers for several cases involving these.

    The merging of multiple field accesses into wider bitfield-like
    accesses is undesirable to do too early in compilation, so we move it
    from folding to ifcombine, and extend ifcombine to merge noncontiguous
    compares, absent intervening side effects.  VUSEs used to prevent
    ifcombine; that seemed excessively conservative, since relevant side
    effects were already tested, including the possibility of trapping
    loads, so that's removed.

    Unlike earlier ifcombine, when merging noncontiguous compares the
    merged compare must replace the earliest compare, which may require
    moving up the DEFs that contributed to the latter compare.

    When it is the second of a noncontiguous pair of compares that first
    accesses a word, we may merge the first compare with part of the
    second compare that refers to the same word, keeping the compare of
    the remaining bits at the spot where the second compare used to be.

    Handling compares with non-constant fields was somewhat generalized
    from what fold used to do, now handling non-adjacent fields, even if a
    field of one object crosses an alignment boundary but the other
    doesn't.

    The -Wno-error for toplev.o on rs6000 is because of toplev.c's:

      if ((flag_sanitize & SANITIZE_ADDRESS)
          && !FRAME_GROWS_DOWNWARD)

    and rs6000.h's:

      #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 \
                                    || (flag_sanitize & SANITIZE_ADDRESS) != 0)

    The mutually exclusive conditions involving flag_sanitize are now
    noticed and reported by ifcombine's warning on mutually exclusive
    compares.  i386's needs -Wno-error for insn-attrtab.o for similar
    reasons.

    for  gcc/ChangeLog

            * fold-const.cc (make_bit_field): Export.
            (unextend, decode_field_reference, fold_truth_andor_1): Moved
            field compare merging logic...
            * gimple-fold.cc: ... here.
            (ssa_is_substitutable_p, is_cast_p, is_binop_p): New.
            (prepare_xor, follow_load): New.
            (compute_split_boundary_from_align): New.
            (make_bit_field_load, build_split_load): New.
            (reuse_split_load, mergeable_loads_p): New.
            (fold_truth_andor_maybe_separate): New.
            * tree-ssa-ifcombine.cc: Include bitmap.h.
            (constant_condition_p): New.
            (recognize_if_then_else_nc, recognize_if_succs): New.
            (bb_no_side_effects_p): Don't reject VUSEs.
            (update_profile_after_ifcombine): Adjust for noncontiguous
            merges.
            (ifcombine_mark_ssa_name): New.
            (struct ifcombine_mark_ssa_name_t): New.
            (ifcombine_mark_ssa_name_walk): New.
            (ifcombine_replace_cond): Extended for noncontiguous merges
            after factoring out of...
            (ifcombine_ifandif): ... this.  Drop result_inv arg.  Try
            fold_truth_andor_maybe_separate.
            (tree_ssa_ifcombine_bb_1): Add outer_succ_bb arg.  Call
            recognize_if_then_else_nc.  Adjust ifcombine_ifandif calls.
            (tree_ssa_ifcombine_bb): Return the earliest affected block.
            Call recognize_if_then_else_nc.  Try noncontiguous blocks.
            (pass_tree_ifcombine::execute): Retry affected blocks.
            * config/i386/t-i386 (insn-attrtab.o-warn): Disable errors.
            * config/rs6000/t-rs6000 (toplev.o-warn): Likewise.

    for  gcc/testsuite/ChangeLog

            * gcc.dg/field-merge-1.c: New.
            * gcc.dg/field-merge-2.c: New.
            * gcc.dg/field-merge-3.c: New.
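
As an illustration only (not code from the patch), the basic merge described in the commit message can be sketched in plain C.  The struct layout, the function names and the way the mask is built are all assumptions made for this sketch; the compiler performs the equivalent rewrite on GIMPLE, not on source code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: x1 and y1 share one aligned 32-bit word.  */
struct pair
{
  uint8_t x1;
  uint8_t y1;
  uint16_t other;
};

_Static_assert (sizeof (struct pair) == sizeof (uint32_t),
                "sketch assumes the struct fills exactly one word");

/* Field-by-field form: (a.x1 == b.x1) && (a.y1 == b.y1).  */
static int
fields_equal_separate (const struct pair *a, const struct pair *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Merged form: one word load per object, one mask, one compare.
   The mask is derived from the struct itself so that the sketch does
   not depend on byte order.  */
static int
fields_equal_merged (const struct pair *a, const struct pair *b)
{
  struct pair ones;
  uint32_t mask, wa, wb;

  memset (&ones, 0, sizeof ones);
  ones.x1 = 0xff;
  ones.y1 = 0xff;
  memcpy (&mask, &ones, sizeof mask);

  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  return (wa & mask) == (wb & mask);
}
```

Both functions should agree for any pair of objects, whatever the value of the unrelated field `other`; the merged form simply trades two narrow loads and two compares per object for one wide load and one masked compare.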
[gcc/aoliva/heads/testme] (213 commits) fold fold_truth_andor into ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 dea162ded074... fold fold_truth_andor into ifcombine

It previously pointed to:

 f88b4c43a2a4... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  f88b4c4... adjust probs after modified ifcombine
  a29037a... support noncontiguous ifcombine
  3ed1ed8... refactor ifcombine
  b0b68cb... support noncontiguous ifcombine
  575a4da... relax ifcombine to accept vuses
  15a55a9... fold truth-and only in ifcombine
  6ce741d... check for mergeable loads, choose insertion points accordin
  d041471... rework truth_andor folding into tree-ssa-ifcombine
  d675d49... assorted improvements for fold_truth_andor_1

Summary of changes (added commits):
---

  dea162d... fold fold_truth_andor into ifcombine
  85910e6... x86: Extend AVX512 Vectorization for Popcount in Various Mo (*)
  78eef89... Define VECTOR_STORE_FLAG_VALUE (*)
  064d5c6... testsuite: Fix testcase g++.dg/modules/indirect-1_b.C [PR11 (*)
  12c8cb8... RISC-V: Add testcases for form 3 of signed vector SAT_ADD (*)
  342221f... Match: Support form 3 for vector signed integer .SAT_ADD (*)
  9d76276... Daily bump. (*)
  14cd108... gfortran testsuite: Remove unit-files in files having open- (*)
  6fee826... testsuite: XFAIL g++.dg/modules/indirect-1_b.C (*)
  d5864b9... testsuite: fix dejagnu typos with underscores (*)
  0b953ce... doc: Remove @code wrapping of fortran option names [PR11680 (*)
  cc40795... i386: Add GENERIC and GIMPLE folders of __builtin_ia32_{min (*)
  c79cc30... x86: Don't use address override with segment regsiter (*)
  ed6dccd... ltmain.sh: allow more flags at link-time (*)
  82d9727... libstdc++: testsuite: fix dg-bogus directive syntax (*)
  3308e82... Fix testsuite failure on 32-bit targets. (*)
  d1e7f3a... Add an alternative testcase for PR 70740 (*)
  6c5543d... match: Fix `a != 0 ? a * b : 0` patterns for things that tr (*)
  7cf85d1... c++: Add testcase for DR 2874 (*)
  0564d95... c++: Add testcase for DR 2836 (*)
  340ef96... c++: Add testcase for DR 2728 (*)
  a88d6c6... match: Fix A || B not optimized to true when !B implies A [ (*)
  0e095df... Speed up get_bitmask_from_range (*)
  6efc770... Speed up wide_int_storage::operator=(wide_int_storage const (*)
  1fea6f8... c++: use TARGET_EXPR accessors (*)
  08b8341... match: Change (A * B) + (-C) to (B - C/A) * A, if C multipl (*)
  af8ff00... remove dominator recursion from reassoc (*)
  9b76263... Remove recursion in simplify_control_stmt_condition_1 [PR11 (*)
  63a598d... libstdc++: #ifdef out #pragma GCC system_header (*)
  2407dbe... libstdc++: more #pragma diagnostic (*)
  7ad17fe... Use tree view for find_always_executed_bbs result (*)
  fcff9c3... OpenMP: Update OMP_REQUIRES_TARGET_USED for declare_target (*)
  5d87b98... RISC-V: Cleanup debug code for SAT_* testcases [NFC] (*)
  cc141b5... rtl-optimization/114855 - slow add_store_equivs in IRA (*)
  0b2d3bf... Disable add_store_equivs when -fno-expensive-optimizations (*)
  caf3fe7... tree-optimization/114855 - slow VRP due to equiv oracle que (*)
  5b652b0... RISC-V: Refine the testcase of vector SAT_TRUNC (*)
  32bcca3... RISC-V: Refine the testcase of vector SAT_SUB (*)
  043d607... RISC-V: Refine the testcase of vector SAT_ADD (*)
  742d242... i386: Update the comment for mapxf option (*)
  6935bdd... OpenMP: Fix testsuite failure on x86 with -m32 (*)
  2d8392c... Daily bump. (*)
  291e20e... Add random numbers and fix some bugs. (*)
  fbeb1a9... Implement IANY, IALL and IPARITY for unsigned. (*)
  1762b7f... options: Regenerate c.opt.urls (*)
  5e918a4... Implement SUM and PRODUCT for unsigned. (*)
  5d98fe0... Implement MATMUL and DOT_PRODUCT for unsigned. (*)
  650e915... c++: Implement C++23 P2718R0 - Wording for P2644R1 Fix for (*)
  d9cafa0... libgcc, Darwin: Drop the legacy library build for macOS >= (*)
  dab4500... i386: Fix comment typo (*)
  ae57e52... c++/contracts: ICE in build_contract_condition_function [PR (*)
  4cb20dc... libgomp: with USM, init 'link' variables with host address (*)
  79a3d3d... [PATCH] RISC-V: Fix FIXED_REGISTERS comment missing return (*)
  96246bf... OpenMP: Check additional restrictions on context selector p (*)
  2114243... Simplify range-op shift mask generation (*)
  de6fe69... Widening-Mul: Fix one ICE for SAT_SUB matching operand chec (*)
  cef2993... tree-optimization/116819 - SLP with !STMT_VINFO_RELEVANT re (*)
  4bd3cca... RISC-V: testsuite: Fix SELECT_VL SLP fallout. (*)
  be50c76... RISC-V: Add more vector-vector extract cases. (*)
  e45537f... RISC-V: Fix effective target check. (*)
  0c0d79c... Fortran: Allow to nullify caf token when not in ultimate co (*)
  2249c3b... build: enable C++11 narrowing warnings (*)
  f5035d7... Fortran: Assign allocated caf-memory to scalar members [PR8 (*)
  9a795b3... tree-optimization/114855 - more update_ss
[gcc/aoliva/heads/testbase] (212 commits) x86: Extend AVX512 Vectorization for Popcount in Various Mo
The branch 'aoliva/heads/testbase' was updated to point to:

 85910e650a61... x86: Extend AVX512 Vectorization for Popcount in Various Mo

It previously pointed to:

 d6d8445c8550... c++: fix constexpr cast from void* diag issue [PR116741]

Diff:

Summary of changes (added commits):
---

  85910e6... x86: Extend AVX512 Vectorization for Popcount in Various Mo (*)
  78eef89... Define VECTOR_STORE_FLAG_VALUE (*)
  064d5c6... testsuite: Fix testcase g++.dg/modules/indirect-1_b.C [PR11 (*)
  12c8cb8... RISC-V: Add testcases for form 3 of signed vector SAT_ADD (*)
  342221f... Match: Support form 3 for vector signed integer .SAT_ADD (*)
  9d76276... Daily bump. (*)
  14cd108... gfortran testsuite: Remove unit-files in files having open- (*)
  6fee826... testsuite: XFAIL g++.dg/modules/indirect-1_b.C (*)
  d5864b9... testsuite: fix dejagnu typos with underscores (*)
  0b953ce... doc: Remove @code wrapping of fortran option names [PR11680 (*)
  cc40795... i386: Add GENERIC and GIMPLE folders of __builtin_ia32_{min (*)
  c79cc30... x86: Don't use address override with segment regsiter (*)
  ed6dccd... ltmain.sh: allow more flags at link-time (*)
  82d9727... libstdc++: testsuite: fix dg-bogus directive syntax (*)
  3308e82... Fix testsuite failure on 32-bit targets. (*)
  d1e7f3a... Add an alternative testcase for PR 70740 (*)
  6c5543d... match: Fix `a != 0 ? a * b : 0` patterns for things that tr (*)
  7cf85d1... c++: Add testcase for DR 2874 (*)
  0564d95... c++: Add testcase for DR 2836 (*)
  340ef96... c++: Add testcase for DR 2728 (*)
  a88d6c6... match: Fix A || B not optimized to true when !B implies A [ (*)
  0e095df... Speed up get_bitmask_from_range (*)
  6efc770... Speed up wide_int_storage::operator=(wide_int_storage const (*)
  1fea6f8... c++: use TARGET_EXPR accessors (*)
  08b8341... match: Change (A * B) + (-C) to (B - C/A) * A, if C multipl (*)
  af8ff00... remove dominator recursion from reassoc (*)
  9b76263... Remove recursion in simplify_control_stmt_condition_1 [PR11 (*)
  63a598d... libstdc++: #ifdef out #pragma GCC system_header (*)
  2407dbe... libstdc++: more #pragma diagnostic (*)
  7ad17fe... Use tree view for find_always_executed_bbs result (*)
  fcff9c3... OpenMP: Update OMP_REQUIRES_TARGET_USED for declare_target (*)
  5d87b98... RISC-V: Cleanup debug code for SAT_* testcases [NFC] (*)
  cc141b5... rtl-optimization/114855 - slow add_store_equivs in IRA (*)
  0b2d3bf... Disable add_store_equivs when -fno-expensive-optimizations (*)
  caf3fe7... tree-optimization/114855 - slow VRP due to equiv oracle que (*)
  5b652b0... RISC-V: Refine the testcase of vector SAT_TRUNC (*)
  32bcca3... RISC-V: Refine the testcase of vector SAT_SUB (*)
  043d607... RISC-V: Refine the testcase of vector SAT_ADD (*)
  742d242... i386: Update the comment for mapxf option (*)
  6935bdd... OpenMP: Fix testsuite failure on x86 with -m32 (*)
  2d8392c... Daily bump. (*)
  291e20e... Add random numbers and fix some bugs. (*)
  fbeb1a9... Implement IANY, IALL and IPARITY for unsigned. (*)
  1762b7f... options: Regenerate c.opt.urls (*)
  5e918a4... Implement SUM and PRODUCT for unsigned. (*)
  5d98fe0... Implement MATMUL and DOT_PRODUCT for unsigned. (*)
  650e915... c++: Implement C++23 P2718R0 - Wording for P2644R1 Fix for (*)
  d9cafa0... libgcc, Darwin: Drop the legacy library build for macOS >= (*)
  dab4500... i386: Fix comment typo (*)
  ae57e52... c++/contracts: ICE in build_contract_condition_function [PR (*)
  4cb20dc... libgomp: with USM, init 'link' variables with host address (*)
  79a3d3d... [PATCH] RISC-V: Fix FIXED_REGISTERS comment missing return (*)
  96246bf... OpenMP: Check additional restrictions on context selector p (*)
  2114243... Simplify range-op shift mask generation (*)
  de6fe69... Widening-Mul: Fix one ICE for SAT_SUB matching operand chec (*)
  cef2993... tree-optimization/116819 - SLP with !STMT_VINFO_RELEVANT re (*)
  4bd3cca... RISC-V: testsuite: Fix SELECT_VL SLP fallout. (*)
  be50c76... RISC-V: Add more vector-vector extract cases. (*)
  e45537f... RISC-V: Fix effective target check. (*)
  0c0d79c... Fortran: Allow to nullify caf token when not in ultimate co (*)
  2249c3b... build: enable C++11 narrowing warnings (*)
  f5035d7... Fortran: Assign allocated caf-memory to scalar members [PR8 (*)
  9a795b3... tree-optimization/114855 - more update_ssa speedup (*)
  3436617... Alphabetize my entry in MAINTAINER's DCO list. (*)
  b752eed... OpenMP: Add support for 'self_maps' to the 'require' direct (*)
  7e560ff... Testsuite, darwin: account for macOS 15 (*)
  f594008... tree-optimization/115372 - failed store-lanes in some cases (*)
  618871f... libstdc++: Remove unnecessary 'static' from __is_specializa (*)
  f9dfe8d... tree-optimization/114855 - high update_ssa time (*)
  824229e... hosthooks.h: Fix GCC_HOST_HOOKS_H typo (*)
  f5ee372... nvptx: Partial support for aliases to aliases. (*)
  4d6fa5b... Daily bump. (*)
  5ef52ec... modula2: Ad
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:f88b4c43a2a478f5a7bc9b1185a9f0eec519f9b4

commit f88b4c43a2a478f5a7bc9b1185a9f0eec519f9b4
Author: Alexandre Oliva
Date:   Thu Sep 19 06:43:22 2024 -0300

    adjust probs after modified ifcombine

Diff:
---
 gcc/fold-const.h                     |  10 +
 gcc/testsuite/gcc.dg/field-merge-7.c |  23 ++
 gcc/tree-ssa-ifcombine.cc            | 486 ---
 3 files changed, 427 insertions(+), 92 deletions(-)

diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 3e3998b57b04..136764f5c7eb 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -258,6 +258,16 @@ extern void clear_type_padding_in_mask (tree, unsigned char *);
 extern bool clear_padding_type_may_have_padding_p (tree);
 extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree,
                                 const_tree);
+extern tree fold_truth_andor_maybe_separate (location_t loc,
+                                             enum tree_code code,
+                                             tree truth_type,
+                                             enum tree_code lcode,
+                                             tree ll_arg,
+                                             tree lr_arg,
+                                             enum tree_code rcode,
+                                             tree rl_arg,
+                                             tree rr_arg,
+                                             tree *separatep);
 
 /* Class used to compare gimple operands.  */
 
diff --git a/gcc/testsuite/gcc.dg/field-merge-7.c b/gcc/testsuite/gcc.dg/field-merge-7.c
new file mode 100644
index ..728a29b6fafa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-7.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ifcombine-details" } */
+
+/* Check that the third compare won't be combined with the first one.  */
+
+struct s {
+  char a, b;
+  int p;
+};
+
+struct s a = { 0, 0, 0 };
+struct s b = { 0, 0, 0 };
+
+int f () {
+  return (a.a != b.a || (a.p != b.p && a.b != b.b));
+}
+
+int g () {
+  return (a.a == b.a && (a.p == b.p || a.b == b.b));
+}
+
+/* { dg-final { scan-tree-dump-not "optimizing" "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "ifcombine" } } */
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 79ccc70b2678..33e30d9f4f58 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa.h"
 #include "attribs.h"
 #include "asan.h"
+#include "bitmap.h"
 
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
@@ -49,6 +50,28 @@ along with GCC; see the file COPYING3.  If not see
                 false) >= 2)
 #endif
 
+/* Return TRUE iff COND is NULL, or the condition in it is constant.  */
+
+static bool
+constant_condition_p (gcond *cond)
+{
+  if (!cond)
+    return true;
+
+  return (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
+          && CONSTANT_CLASS_P (gimple_cond_rhs (cond)));
+}
+
+/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not
+   constant.  */
+
+static bool
+constant_condition_p (basic_block cond_bb)
+{
+  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
+  return constant_condition_p (cond);
+}
+
 /* This pass combines COND_EXPRs to simplify control flow.  It currently
    recognizes bit tests and comparisons in chains that represent
    logical and or logical or of two COND_EXPRs.
@@ -107,15 +130,38 @@ recognize_if_then_else (basic_block cond_bb,
   if (!*else_bb)
     *else_bb = e->dest;
 
-  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
-  if (!cond)
-    return false;
+  return true;
+}
+
+/* Same as recognize_if_then_else, but check that the condition is not
+   constant.  It is not useful to combine constant conditions.  */
 
-  if (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
-      && CONSTANT_CLASS_P (gimple_cond_rhs (cond)))
+static bool
+recognize_if_then_else_nc (basic_block cond_bb,
+                           basic_block *then_bb, basic_block *else_bb)
+{
+  return recognize_if_then_else (cond_bb, then_bb, else_bb)
+    && !constant_condition_p (cond_bb);
+}
+
+/* Same as recognize_if_then_else, but don't associate the blocks with then and
+   else, check both possibilities.  */
+
+static bool
+recognize_if_succs (basic_block cond_bb,
+                    basic_block succ1, basic_block succ2)
+{
+  basic_block t, e;
+
+  if (EDGE_COUNT (cond_bb->succs) != 2)
     return false;
 
-  return true;
+  /* Find both succs.  */
+  t = EDGE_SUCC (cond_bb, 0)->dest;
+  e = EDGE_SUCC (cond_bb, 1)->dest;
+
+  return ((t == succ1 && e == succ2)
+          || (t == succ2 && e == succ1));
 }
 
 /* Verify if the basic block BB does not have side-effects.  Return
@@ -364,14 +410,28 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv)
 }
 
 
-/* Update profile after code in outer_cond_bb was adjusted so
-   out
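
The field-merge-7.c test above checks that the compare of `b` is not merged with the compare of `a`, even though both fields sit in the same word.  One way to see why such a merge would be wrong is the sketch below, which is not part of the patch: `f_original` has the shape of `f ()` in the test, with the objects passed as parameters, and `f_wrongly_merged` is a hypothetical, invalid result of treating the first and third compares as if they were directly ORed.

```c
#include <assert.h>

struct s
{
  char a, b;
  int p;
};

/* Same shape as f () in field-merge-7.c: the compare of b only
   matters when the compare of p is also true.  */
static int
f_original (const struct s *x, const struct s *y)
{
  return (x->a != y->a || (x->p != y->p && x->b != y->b));
}

/* Hypothetical invalid merge: combines the compares of a and b and
   drops the guard on p.  This is what the test expects NOT to see.  */
static int
f_wrongly_merged (const struct s *x, const struct s *y)
{
  return (x->a != y->a || x->b != y->b);
}
```

With equal `a` and `p` but different `b`, the original returns 0 while the wrongly merged form returns 1, so the two compares cannot be folded into a single masked word compare.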
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 f88b4c43a2a4... adjust probs after modified ifcombine

It previously pointed to:

 7660c0400ee5... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  7660c04... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  f88b4c4... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:7660c0400ee5c9207397d3936be695af962d455e

commit 7660c0400ee5c9207397d3936be695af962d455e
Author: Alexandre Oliva
Date:   Thu Sep 19 06:43:22 2024 -0300

    adjust probs after modified ifcombine

Diff:
---
 gcc/fold-const.h                     |  10 +
 gcc/testsuite/gcc.dg/field-merge-7.c |  23 ++
 gcc/tree-ssa-ifcombine.cc            | 482 ---
 3 files changed, 423 insertions(+), 92 deletions(-)

diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 3e3998b57b04..136764f5c7eb 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -258,6 +258,16 @@ extern void clear_type_padding_in_mask (tree, unsigned char *);
 extern bool clear_padding_type_may_have_padding_p (tree);
 extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree,
                                 const_tree);
+extern tree fold_truth_andor_maybe_separate (location_t loc,
+                                             enum tree_code code,
+                                             tree truth_type,
+                                             enum tree_code lcode,
+                                             tree ll_arg,
+                                             tree lr_arg,
+                                             enum tree_code rcode,
+                                             tree rl_arg,
+                                             tree rr_arg,
+                                             tree *separatep);
 
 /* Class used to compare gimple operands.  */
 
diff --git a/gcc/testsuite/gcc.dg/field-merge-7.c b/gcc/testsuite/gcc.dg/field-merge-7.c
new file mode 100644
index ..b0e953c01e91
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-7.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ifcombine-details" } */
+
+/* Check that the third compare won't be combined with the first one.  */
+
+struct s {
+  char a, b;
+  int p;
+};
+
+struct s a = { 0, 0, 0 };
+struct s b = { 0, 0, 0 };
+
+int f () {
+  return (a.a != b.a || (a.p != b.p && a.b != b.b));
+}
+
+int g () {
+  return (a.a == b.a && (a.p == b.p || a.b == b.b));
+}
+
+/* { dg-do { scan-tree-dump-not "optimizing" "ifcombine" } } */
+/* { dg-do { scan-tree-dump-not "BIT_FIELD_REF" "ifcombine" } } */
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 79ccc70b2678..9bb250bf2993 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa.h"
 #include "attribs.h"
 #include "asan.h"
+#include "bitmap.h"
 
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
@@ -49,6 +50,28 @@ along with GCC; see the file COPYING3.  If not see
                 false) >= 2)
 #endif
 
+/* Return TRUE iff COND is NULL, or the condition in it is constant.  */
+
+static bool
+constant_condition_p (gcond *cond)
+{
+  if (!cond)
+    return true;
+
+  return (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
+          && CONSTANT_CLASS_P (gimple_cond_rhs (cond)));
+}
+
+/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not
+   constant.  */
+
+static bool
+constant_condition_p (basic_block cond_bb)
+{
+  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
+  return constant_condition_p (cond);
+}
+
 /* This pass combines COND_EXPRs to simplify control flow.  It currently
    recognizes bit tests and comparisons in chains that represent
    logical and or logical or of two COND_EXPRs.
@@ -107,15 +130,38 @@ recognize_if_then_else (basic_block cond_bb,
   if (!*else_bb)
     *else_bb = e->dest;
 
-  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
-  if (!cond)
-    return false;
+  return true;
+}
+
+/* Same as recognize_if_then_else, but check that the condition is not
+   constant.  It is not useful to combine constant conditions.  */
 
-  if (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
-      && CONSTANT_CLASS_P (gimple_cond_rhs (cond)))
+static bool
+recognize_if_then_else_nc (basic_block cond_bb,
+                           basic_block *then_bb, basic_block *else_bb)
+{
+  return recognize_if_then_else (cond_bb, then_bb, else_bb)
+    && !constant_condition_p (cond_bb);
+}
+
+/* Same as recognize_if_then_else, but don't associate the blocks with then and
+   else, check both possibilities.  */
+
+static bool
+recognize_if_succs (basic_block cond_bb,
+                    basic_block succ1, basic_block succ2)
+{
+  basic_block t, e;
+
+  if (EDGE_COUNT (cond_bb->succs) != 2)
     return false;
 
-  return true;
+  /* Find both succs.  */
+  t = EDGE_SUCC (cond_bb, 0)->dest;
+  e = EDGE_SUCC (cond_bb, 1)->dest;
+
+  return ((t == succ1 && e == succ2)
+          || (t == succ2 && e == succ1));
 }
 
 /* Verify if the basic block BB does not have side-effects.  Return
@@ -364,14 +410,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv)
 }
 
 
-/* Update profile after code in outer_cond_bb was adjusted so
-   outer_con
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 7660c0400ee5... adjust probs after modified ifcombine

It previously pointed to:

 1d6ef2e03dff... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  1d6ef2e... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  7660c04... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:1d6ef2e03dff76f12ead4aceaf662d2e350d2678

commit 1d6ef2e03dff76f12ead4aceaf662d2e350d2678
Author: Alexandre Oliva
Date:   Thu Sep 19 06:43:22 2024 -0300

    adjust probs after modified ifcombine

Diff:
---
 gcc/fold-const.h                     |  10 +
 gcc/testsuite/gcc.dg/field-merge-7.c |  19 ++
 gcc/tree-ssa-ifcombine.cc            | 482 ---
 3 files changed, 419 insertions(+), 92 deletions(-)

diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 3e3998b57b04..136764f5c7eb 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -258,6 +258,16 @@ extern void clear_type_padding_in_mask (tree, unsigned char *);
 extern bool clear_padding_type_may_have_padding_p (tree);
 extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree,
                                 const_tree);
+extern tree fold_truth_andor_maybe_separate (location_t loc,
+                                             enum tree_code code,
+                                             tree truth_type,
+                                             enum tree_code lcode,
+                                             tree ll_arg,
+                                             tree lr_arg,
+                                             enum tree_code rcode,
+                                             tree rl_arg,
+                                             tree rr_arg,
+                                             tree *separatep);
 
 /* Class used to compare gimple operands.  */
 
diff --git a/gcc/testsuite/gcc.dg/field-merge-7.c b/gcc/testsuite/gcc.dg/field-merge-7.c
new file mode 100644
index ..16a06286d823
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-7.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ifcombine-details" } */
+
+/* Check that the third compare won't be combined with the first one.  */
+
+struct s {
+  char a, b;
+  int p;
+};
+
+struct s a = { 0, 0, 0 };
+struct s b = { 0, 0, 0 };
+
+int f () {
+  return (a.a != b.a && (a.p != b.p || a.b != b.b));
+}
+
+/* { dg-do { scan-tree-dump-not "optimizing" "ifcombine" } } */
+/* { dg-do { scan-tree-dump-not "BIT_FIELD_REF" "ifcombine" } } */
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 79ccc70b2678..9bb250bf2993 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa.h"
 #include "attribs.h"
 #include "asan.h"
+#include "bitmap.h"
 
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
@@ -49,6 +50,28 @@ along with GCC; see the file COPYING3.  If not see
                 false) >= 2)
 #endif
 
+/* Return TRUE iff COND is NULL, or the condition in it is constant.  */
+
+static bool
+constant_condition_p (gcond *cond)
+{
+  if (!cond)
+    return true;
+
+  return (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
+          && CONSTANT_CLASS_P (gimple_cond_rhs (cond)));
+}
+
+/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not
+   constant.  */
+
+static bool
+constant_condition_p (basic_block cond_bb)
+{
+  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
+  return constant_condition_p (cond);
+}
+
 /* This pass combines COND_EXPRs to simplify control flow.  It currently
    recognizes bit tests and comparisons in chains that represent
    logical and or logical or of two COND_EXPRs.
@@ -107,15 +130,38 @@ recognize_if_then_else (basic_block cond_bb,
   if (!*else_bb)
     *else_bb = e->dest;
 
-  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
-  if (!cond)
-    return false;
+  return true;
+}
+
+/* Same as recognize_if_then_else, but check that the condition is not
+   constant.  It is not useful to combine constant conditions.  */
 
-  if (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
-      && CONSTANT_CLASS_P (gimple_cond_rhs (cond)))
+static bool
+recognize_if_then_else_nc (basic_block cond_bb,
+                           basic_block *then_bb, basic_block *else_bb)
+{
+  return recognize_if_then_else (cond_bb, then_bb, else_bb)
+    && !constant_condition_p (cond_bb);
+}
+
+/* Same as recognize_if_then_else, but don't associate the blocks with then and
+   else, check both possibilities.  */
+
+static bool
+recognize_if_succs (basic_block cond_bb,
+                    basic_block succ1, basic_block succ2)
+{
+  basic_block t, e;
+
+  if (EDGE_COUNT (cond_bb->succs) != 2)
     return false;
 
-  return true;
+  /* Find both succs.  */
+  t = EDGE_SUCC (cond_bb, 0)->dest;
+  e = EDGE_SUCC (cond_bb, 1)->dest;
+
+  return ((t == succ1 && e == succ2)
+          || (t == succ2 && e == succ1));
 }
 
 /* Verify if the basic block BB does not have side-effects.  Return
@@ -364,14 +410,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv)
 }
 
 
-/* Update profile after code in outer_cond_bb was adjusted so
-   outer_cond_bb has no condition.  */
+/* Update profile after code in either oute
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 1d6ef2e03dff... adjust probs after modified ifcombine

It previously pointed to:

 3bd0565cf130... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  3bd0565... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  1d6ef2e... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:3bd0565cf130cb126fea3287d144031ec831

commit 3bd0565cf130cb126fea3287d144031ec831
Author: Alexandre Oliva
Date:   Thu Sep 19 06:43:22 2024 -0300

    adjust probs after modified ifcombine

Diff:
---
 gcc/fold-const.h          |  10 ++
 gcc/tree-ssa-ifcombine.cc | 416 --
 2 files changed, 335 insertions(+), 91 deletions(-)

diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 3e3998b57b04..136764f5c7eb 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -258,6 +258,16 @@ extern void clear_type_padding_in_mask (tree, unsigned char *);
 extern bool clear_padding_type_may_have_padding_p (tree);
 extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree,
                                 const_tree);
+extern tree fold_truth_andor_maybe_separate (location_t loc,
+                                             enum tree_code code,
+                                             tree truth_type,
+                                             enum tree_code lcode,
+                                             tree ll_arg,
+                                             tree lr_arg,
+                                             enum tree_code rcode,
+                                             tree rl_arg,
+                                             tree rr_arg,
+                                             tree *separatep);
 
 /* Class used to compare gimple operands. */
 
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 79ccc70b2678..8258df8dbaf3 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-ssa.h"
 #include "attribs.h"
 #include "asan.h"
+#include "bitmap.h"
 
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
@@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see
                 false) >= 2)
 #endif
 
+/* Return TRUE iff COND is NULL, or the condition in it is constant. */
+
+static bool
+constant_condition_p (gcond *cond)
+{
+  if (!cond)
+    return true;
+
+  return (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
+          && CONSTANT_CLASS_P (gimple_cond_rhs (cond)));
+}
+
+/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not
+   constant. */
+
+static bool
+constant_condition_p (basic_block cond_bb)
+{
+  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
+  return constant_condition_p (cond);
+}
+
 /* This pass combines COND_EXPRs to simplify control flow. It
    currently recognizes bit tests and comparisons in chains that
    represent logical and or logical or of two COND_EXPRs.
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb,
   if (!*else_bb)
     *else_bb = e->dest;
 
-  gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb));
-  if (!cond)
-    return false;
-
-  if (CONSTANT_CLASS_P (gimple_cond_lhs (cond))
-      && CONSTANT_CLASS_P (gimple_cond_rhs (cond)))
+  if (constant_condition_p (cond_bb))
     return false;
 
   return true;
@@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv)
 }
 
 
-/* Update profile after code in outer_cond_bb was adjusted so
-   outer_cond_bb has no condition. */
+/* Update profile after code in either outer_cond_bb or inner_cond_bb was
+   adjusted so that it has no condition. */
 
 static void
 update_profile_after_ifcombine (basic_block inner_cond_bb,
                                 basic_block outer_cond_bb)
 {
-  edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb);
+  /* In the following we assume that inner_cond_bb has single predecessor. */
+  gcc_assert (single_pred_p (inner_cond_bb));
+
+  basic_block outer_to_inner_bb = inner_cond_bb;
+  profile_probability prob = profile_probability::always ();
+  for (basic_block parent = single_pred (outer_to_inner_bb);
+       parent != outer_cond_bb;
+       parent = single_pred (outer_to_inner_bb))
+    {
+      prob *= find_edge (parent, outer_to_inner_bb)->probability;
+      outer_to_inner_bb = parent;
+    }
+
+  edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb);
   edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner
                  ? EDGE_SUCC (outer_cond_bb, 1)
                  : EDGE_SUCC (outer_cond_bb, 0));
@@ -382,38 +413,98 @@ update_profile_after_ifcombine (basic_block inner_cond_bb,
     std::swap (inner_taken, inner_not_taken);
   gcc_assert (inner_taken->dest == outer2->dest);
 
-  /* In the following we assume that inner_cond_bb has single predecessor. */
-  gcc_assert (single_pred_p (inner_cond_bb));
-
-  /* Path outer_cond_bb->(outer2) needs to be merged into path
-     outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken)
-     and probability of inner_not_taken updated. */
-
-  inner_cond_bb->count = outer_cond_bb->count;
+  if (outer_to_inner_bb == inner_cond_bb
+      && constant_condition_p (outer_cond_bb))
+    {
+
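The new loop in update_profile_after_ifcombine walks up from inner_cond_bb through single predecessors until it reaches the block whose predecessor is outer_cond_bb, multiplying the edge probabilities it crosses. A minimal standalone sketch of that walk follows. It does not use GCC's types: `Block`, `walk_to_outer`, the index-based graph and the use of `double` in place of profile_probability are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a basic block with a single predecessor.
// GCC's basic_block and edge types are not used here.
struct Block {
  int pred;          // index of the single predecessor, or -1 if none
  double edge_prob;  // probability of the edge pred -> this block
};

// Walk from INNER up the single-predecessor chain and stop at the block
// whose predecessor is OUTER.  Return that block's index, and store in
// *PROB the product of the probabilities of the edges crossed on the way.
// The edge leaving OUTER itself is not included in the product.
int walk_to_outer(const std::vector<Block> &blocks, int inner, int outer,
                  double *prob)
{
  int cur = inner;
  *prob = 1.0;  // plays the role of profile_probability::always ()
  for (int parent = blocks[cur].pred; parent != outer;
       parent = blocks[cur].pred) {
    assert(parent >= 0);  // the chain must eventually reach OUTER
    *prob *= blocks[cur].edge_prob;
    cur = parent;
  }
  return cur;
}
```

With a chain outer -> b1 -> b2 -> inner, the call returns b1 and the product covers only the b1 -> b2 and b2 -> inner edges. When inner is a direct successor of outer, the loop body never runs and the product stays at 1.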
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 3bd0565cf130... adjust probs after modified ifcombine

It previously pointed to:

 f8372d9ec46d... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  f8372d9... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  3bd0565... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:f8372d9ec46d74dc9e5f273c0a7bb2a9ad65578d commit f8372d9ec46d74dc9e5f273c0a7bb2a9ad65578d Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 409 -- 1 file changed, 326 insertions(+), 83 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..f058120a93c0 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. 
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +413,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. 
*/ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities
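The contiguous case in the diff above folds the outer2 path into inner_taken: the merged probability is outer2 plus outer_to_inner times the old inner_taken, and inner_not_taken becomes its complement. A small worked sketch of that arithmetic follows, using plain `double` values instead of GCC's profile_probability. The function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Probability that the merged condition takes the "taken" edge, assuming
// OUTER2 and OUTER_TO_INNER are the probabilities of the two edges leaving
// the outer block and INNER_TAKEN is the old probability of the inner
// block's taken edge.  Plain doubles stand in for profile_probability.
double merged_taken_probability(double outer2, double outer_to_inner,
                                double inner_taken)
{
  // Special case from the original comment: if the inner edge was always
  // taken, the overall outcome is always taken as well, so leave it alone.
  if (inner_taken == 1.0)
    return 1.0;
  return outer2 + outer_to_inner * inner_taken;
}

// The not-taken edge receives the complement of the merged probability.
double not_taken_probability(double merged_taken)
{
  return 1.0 - merged_taken;
}
```

For example, with outer2 = 0.25, outer_to_inner = 0.75 and inner_taken = 0.5, the merged taken probability is 0.25 + 0.75 * 0.5 = 0.625, and the not-taken probability is 0.375.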
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 f8372d9ec46d... adjust probs after modified ifcombine

It previously pointed to:

 cd77751018e2... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  cd77751... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  f8372d9... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:cd77751018e24e7288b109978d41e0002410c49d commit cd77751018e24e7288b109978d41e0002410c49d Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 403 +- 1 file changed, 323 insertions(+), 80 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..425fbae46eac 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. 
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +413,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. 
*/ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 cd77751018e2... adjust probs after modified ifcombine

It previously pointed to:

 9919ba13180b... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  9919ba1... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  cd77751... adjust probs after modified ifcombine
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 9919ba13180b... adjust probs after modified ifcombine

It previously pointed to:

 90e42ef87e8f... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  90e42ef... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  9919ba1... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:9919ba13180b34d127c45b97117bae0b9036dc13 commit 9919ba13180b34d127c45b97117bae0b9036dc13 Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 403 +- 1 file changed, 323 insertions(+), 80 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..79f89570acb6 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. 
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +413,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. 
*/ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:90e42ef87e8f9e381043aea4a56b37249e6f4ced commit 90e42ef87e8f9e381043aea4a56b37249e6f4ced Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 355 +- 1 file changed, 289 insertions(+), 66 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..1998b8deb6d1 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. 
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +413,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. 
*/ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 90e42ef87e8f... adjust probs after modified ifcombine

It previously pointed to:

 cddcc2cb25ff... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  cddcc2c... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  90e42ef... adjust probs after modified ifcombine
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 cddcc2cb25ff... adjust probs after modified ifcombine

It previously pointed to:

 89ff1a069da3... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  89ff1a0... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  cddcc2c... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:cddcc2cb25fffc4a119614c522bd6a21b436c97e commit cddcc2cb25fffc4a119614c522bd6a21b436c97e Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 355 +- 1 file changed, 289 insertions(+), 66 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..d08bd942643d 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. 
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +413,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. 
*/ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 89ff1a069da3... adjust probs after modified ifcombine

It previously pointed to:

 3a8bc917de46... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  3a8bc91... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  89ff1a0... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:89ff1a069da3ca97aa80e3dbf9fa5d9b5e2e4061 commit 89ff1a069da3ca97aa80e3dbf9fa5d9b5e2e4061 Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 294 +++--- 1 file changed, 228 insertions(+), 66 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..0aa6871d0ab4 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. 
@@ -107,12 +130,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +382,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +413,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. 
*/ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities
[gcc/aoliva/heads/testme] adjust probs after modified ifcombine
The branch 'aoliva/heads/testme' was updated to point to:

 3a8bc917de46... adjust probs after modified ifcombine

It previously pointed to:

 04f96e7c8ce3... adjust probs after modified ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  04f96e7... adjust probs after modified ifcombine

Summary of changes (added commits):
---

  3a8bc91... adjust probs after modified ifcombine
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:3a8bc917de463960db6c0bcd12f5a0537e654c94 commit 3a8bc917de463960db6c0bcd12f5a0537e654c94 Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 244 +- 1 file changed, 178 insertions(+), 66 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..47fc64383384 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -49,6 +49,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. @@ -107,12 +129,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +381,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition.
*/ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +412,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. */ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. 
*/ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities will be conservative because it does not know +that outer2->probability is inverse of +outer_to_inner->probability. */ + if (inner_taken->probability == profile_probability::always ()) + ; + else +
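The probability update in the hunk above can be illustrated with a minimal sketch. It assumes plain `double` values in [0, 1] in place of GCC's `profile_probability` class, and the names `combined_taken`, `combined_not_taken`, and `chain_probability` are hypothetical helpers, not functions from the patch. It only mirrors the arithmetic visible in the diff: the combined taken probability is `outer2 + outer_to_inner * inner_taken`, with the "always" special case kept as is, and the new loop multiplies edge probabilities along the chain of single predecessors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for profile_probability: a double in [0, 1].
   The real class also tracks quality and saturates; this sketch does not.  */

/* Probability of reaching the shared destination once the outer condition
   has been folded into the inner one: either the outer test went there
   directly (OUTER2), or it fell through to the inner test and that test
   went there (OUTER_TO_INNER * INNER_TAKEN).  */
double
combined_taken (double outer2, double outer_to_inner, double inner_taken)
{
  assert (outer2 >= 0.0 && outer2 <= 1.0);
  assert (outer_to_inner >= 0.0 && outer_to_inner <= 1.0);
  assert (inner_taken >= 0.0 && inner_taken <= 1.0);

  /* Special case described in the patch comment: if the inner edge is
     always taken, the combined outcome is always taken as well.  */
  if (inner_taken == 1.0)
    return 1.0;

  return outer2 + outer_to_inner * inner_taken;
}

/* The not-taken edge gets the complement.  */
double
combined_not_taken (double taken)
{
  return 1.0 - taken;
}

/* Mirror of the "prob *= edge probability" loop: the probability of walking
   a chain of N edges is the product of the per-edge probabilities.  */
double
chain_probability (const double *edge_probs, size_t n)
{
  double prob = 1.0;
  for (size_t i = 0; i < n; i++)
    prob *= edge_probs[i];
  return prob;
}
```

For instance, with `outer2 = 0.25`, `outer_to_inner = 0.75`, and `inner_taken = 0.5`, the combined taken probability is 0.25 + 0.75 * 0.5 = 0.625, and the not-taken edge gets 0.375.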
[gcc(refs/users/aoliva/heads/testme)] adjust probs after modified ifcombine
https://gcc.gnu.org/g:04f96e7c8ce32b3a198fd8d3e9c5a929d3a713f1 commit 04f96e7c8ce32b3a198fd8d3e9c5a929d3a713f1 Author: Alexandre Oliva Date: Thu Sep 19 06:43:22 2024 -0300 adjust probs after modified ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 248 +- 1 file changed, 180 insertions(+), 68 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79ccc70b2678..4c5f39d9b3e1 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -49,6 +49,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) +return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. @@ -107,12 +129,7 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; - gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb)); - if (!cond) -return false; - - if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) - && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) + if (constant_condition_p (cond_bb)) return false; return true; @@ -364,14 +381,27 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition.
*/ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (basic_block parent = single_pred (outer_to_inner_bb); + parent != outer_cond_bb; + parent = single_pred (outer_to_inner_bb)) +{ + prob *= find_edge (parent, outer_to_inner_bb)->probability; + outer_to_inner_bb = parent; +} + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -382,29 +412,62 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. */ - gcc_assert (single_pred_p (inner_cond_bb)); + if (constant_condition_p (outer_cond_bb)) +{ + gcc_checking_assert (outer_to_inner_bb == inner_cond_bb); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Path outer_cond_bb->(outer2) needs to be merged into path +outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) +and probability of inner_not_taken updated. */ - inner_cond_bb->count = outer_cond_bb->count; + inner_cond_bb->count = outer_cond_bb->count; - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. 
*/ - if (inner_taken->probability == profile_probability::always ()) -; - else -inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () -- inner_taken->probability; + /* Handle special case where inner_taken probability is always. In this +case we know that the overall outcome will be always as well, but +combining probabilities will be conservative because it does not know +that outer2->probability is inverse of +outer_to_inner->probability. */ + if (inner_taken->probability == profile_probability::always ()) + ; + else +
[gcc(refs/users/aoliva/heads/testme)] assorted improvements for fold_truth_andor_1
https://gcc.gnu.org/g:d675d492ee65967ca5dcae755a73f33c514ed8ca commit d675d492ee65967ca5dcae755a73f33c514ed8ca Author: Alexandre Oliva Date: Tue Sep 17 20:15:13 2024 -0300 assorted improvements for fold_truth_andor_1 This patch introduces various improvements to the logic that merges field compares. Before the patch, we could merge: (a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1) into something like: (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK) if both of A's fields live within the same alignment boundaries, and so do B's, at the same relative positions. Constants may be used instead of the object B. The initial goal of this patch was to enable such combinations when a field crossed alignment boundaries, e.g. for packed types. We can't generally access such fields with a single memory access, so when we come across such a compare, we will attempt to combine each access separately. Some merging opportunities were missed because of right-shifts, compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and narrowing conversions, especially after earlier merges. This patch introduces handlers for several cases involving these. Other merging opportunities were missed because of association. The existing logic would only succeed in merging a pair of consecutive compares, or e.g. B with C in (A ANDOR B) ANDOR C, not even trying e.g. C and D in (A ANDOR (B ANDOR C)) ANDOR D. I've generalized the handling of the rightmost compare in the left-hand operand, going for the leftmost compare in the right-hand operand, and then onto trying to merge compares pairwise, one from each operand, even if they are not consecutive, taking care to avoid merging operations with intervening side effects, including volatile accesses. 
When it is the second of a non-consecutive pair of compares that first accesses a word, we may merge the first compare with part of the second compare that refers to the same word, keeping the compare of the remaining bits at the spot where the second compare used to be. Handling compares with non-constant fields was somewhat generalized, now handling non-adjacent fields. When a field of one object crosses an alignment boundary but the other doesn't, we issue the same load in both compares; gimple optimizers will later turn it into a single load, without our having to handle SAVE_EXPRs at this point. The logic for issuing split loads and compares, and ordering them, is now shared between all cases of compares with constants and with another object. The -Wno-error for toplev.o on rs6000 is because of toplev.c's: if ((flag_sanitize & SANITIZE_ADDRESS) && !FRAME_GROWS_DOWNWARD) and rs6000.h's: #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 \ || (flag_sanitize & SANITIZE_ADDRESS) != 0) The mutually exclusive conditions involving flag_sanitize are now noticed and reported by fold-const.c's: warning (0, "%<and%> of mutually exclusive equal-tests" " is always 0"); This patch enables over 12k compare-merging opportunities that we used to miss in a GCC bootstrap. for gcc/ChangeLog * fold-const.cc (prepare_xor): New. (decode_field_reference): Handle xor, shift, and narrowing conversions. (all_ones_mask_p): Remove. (compute_split_boundary_from_align): New. (build_split_load, reuse_split_load): New. (fold_truth_andor_1): Add recursion to combine pairs of non-neighboring compares. Handle xor compared with zero. Handle fields straddling across alignment boundaries. Generalize handling of non-constant rhs. (fold_truth_andor): Leave sub-expression handling to the recursion above. * config/rs6000/t-rs6000 (toplev.o-warn): Disable errors. for gcc/testsuite/ChangeLog * gcc.dg/field-merge-1.c: New. * gcc.dg/field-merge-2.c: New. * gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New. * gcc.dg/field-merge-5.c: New. Diff: --- gcc/config/rs6000/t-rs6000 | 4 + gcc/fold-const.cc| 818 --- gcc/testsuite/gcc.dg/field-merge-1.c | 64 +++ gcc/testsuite/gcc.dg/field-merge-2.c | 31 ++ gcc/testsuite/gcc.dg/field-merge-3.c | 36 ++ gcc/testsuite/gcc.dg/field-merge-4.c | 40 ++ gcc/testsuite/gcc.dg/field-merge-5.c | 40 ++ 7 files changed, 881 insertions(+), 152 deletions(-) diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000 index 155788de40a3..a83968d663a6
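As a rough illustration of the field-compare merging described in the commit message above, the following sketch contrasts two separate field compares with a single masked word compare, in the spirit of `(((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)`. It is not the compiler's transformation: `struct pair`, `eq_fieldwise`, and `eq_merged` are hypothetical names, the fields are assumed to share one aligned 32-bit word, and `memcpy` stands in for the wider load so the example stays well defined and independent of byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two one-byte fields followed by bytes that the compare must ignore.  */
struct pair
{
  unsigned char x1;
  unsigned char y1;
  unsigned char other[2];
};

_Static_assert (sizeof (struct pair) == sizeof (uint32_t),
		"sketch assumes the struct fills exactly one 32-bit word");

/* The source-level form: (a.x1 == b.x1) && (a.y1 == b.y1).  */
int
eq_fieldwise (const struct pair *a, const struct pair *b)
{
  assert (a && b);
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* The merged form: load the containing word of each object once, mask off
   the bits that belong to neither field, and compare the results.  */
int
eq_merged (const struct pair *a, const struct pair *b)
{
  assert (a && b);

  /* Build the mask from the struct layout itself, so that it covers x1 and
     y1 whatever the target's byte order.  */
  const struct pair field_bits = { 0xff, 0xff, { 0, 0 } };
  uint32_t wa, wb, mask;

  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  memcpy (&mask, &field_bits, sizeof mask);

  return (wa & mask) == (wb & mask);
}
```

Both functions should agree for any pair of objects; the bytes in `other` never influence the result because the mask clears them.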
[gcc(refs/users/aoliva/heads/testme)] rework truth_andor folding into tree-ssa-ifcombine
https://gcc.gnu.org/g:d041471d649c47763535d673ad689654d3630223 commit d041471d649c47763535d673ad689654d3630223 Author: Alexandre Oliva Date: Tue Sep 17 20:15:22 2024 -0300 rework truth_andor folding into tree-ssa-ifcombine Diff: --- gcc/fold-const.cc | 1048 + gcc/gimple-fold.cc| 1149 + gcc/tree-ssa-ifcombine.cc |7 +- 3 files changed, 1170 insertions(+), 1034 deletions(-) diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 6dbb9208dc29..552a706ab6de 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -137,7 +137,6 @@ static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code, tree, tree, tree, tree); -static tree unextend (tree, int, int, tree); static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *); static tree fold_binary_op_with_conditional_arg (location_t, @@ -4701,7 +4700,7 @@ invert_truthvalue_loc (location_t loc, tree arg) is the original memory reference used to preserve the alias set of the access. */ -static tree +tree make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type, HOST_WIDE_INT bitsize, poly_int64 bitpos, int unsignedp, int reversep) @@ -4951,212 +4950,6 @@ optimize_bit_field_compare (location_t loc, enum tree_code code, return lhs; } -/* If *R_ARG is a constant zero, and L_ARG is a possibly masked - BIT_XOR_EXPR, return 1 and set *r_arg to l_arg. - Otherwise, return 0. - - The returned value should be passed to decode_field_reference for it - to handle l_arg, and then doubled for r_arg. 
*/ -static int -prepare_xor (tree l_arg, tree *r_arg) -{ - int ret = 0; - - if (!integer_zerop (*r_arg)) -return ret; - - tree exp = l_arg; - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) -{ - tree and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return ret; -} - - if (TREE_CODE (exp) == BIT_XOR_EXPR) -{ - *r_arg = l_arg; - return 1; -} - - return ret; -} - -/* Subroutine for fold_truth_andor_1: decode a field reference. - - If EXP is a comparison reference, we return the innermost reference. - - *PBITSIZE is set to the number of bits in the reference, *PBITPOS is - set to the starting bit number. - - If the innermost field can be completely contained in a mode-sized - unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. - - *PVOLATILEP is set to 1 if the any expression encountered is volatile; - otherwise it is not changed. - - *PUNSIGNEDP is set to the signedness of the field. - - *PREVERSEP is set to the storage order of the field. - - *PMASK is set to the mask used. This is either contained in a - BIT_AND_EXPR or derived from the width of the field. - - *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. - - XOR_WHICH is 1 or 2 if EXP was found to be a (possibly masked) - BIT_XOR_EXPR compared with zero. We're to take the first or second - operand thereof if so. It should be zero otherwise. - - Return 0 if this is not a component reference or is one that we can't - do anything with. 
*/ - -static tree -decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, - HOST_WIDE_INT *pbitpos, machine_mode *pmode, - int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask, int xor_which) -{ - tree exp = *exp_; - tree outer_type = 0; - tree and_mask = 0; - tree mask, inner, offset; - tree unsigned_type; - unsigned int precision; - HOST_WIDE_INT shiftrt = 0; - - /* All the optimizations using this function assume integer fields. - There are problems with FP fields since the type_for_size call - below can fail for, e.g., XFmode. */ - if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) -return NULL_TREE; - - /* We are interested in the bare arrangement of bits, so strip everything - that doesn't affect the machine mode. However, record the type of the - outermost expression if it may matter below. */ - if (CONVERT_EXPR_P (exp) - || TREE_CODE (exp) == NON_LVALUE_EXPR) -outer_type = TREE_TYPE (exp); - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) -{ - and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return NULL_TREE; -} - - if (xor_which) -{ - gcc_checking_assert (TREE_CODE (exp) == BIT_XOR_EXPR); -
[gcc(refs/users/aoliva/heads/testme)] fold truth-and only in ifcombine
https://gcc.gnu.org/g:15a55a94711d51d95fb6b5ba763903d75e85324e commit 15a55a94711d51d95fb6b5ba763903d75e85324e Author: Alexandre Oliva Date: Tue Sep 17 20:15:35 2024 -0300 fold truth-and only in ifcombine Diff: --- gcc/gimple-fold.cc| 2 ++ gcc/tree-ssa-ifcombine.cc | 24 +--- 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 85a0ec028030..5b7d83edbea9 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -8738,12 +8738,14 @@ maybe_fold_and_comparisons (tree type, op2b, outer_cond_bb)) return t; +#if 0 if (tree t = fold_truth_andor_maybe_separate (UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, type, code2, op2a, op2b, code1, op1a, op1b, NULL)) return t; +#endif return NULL_TREE; } diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79a4bdd363b9..61480e5fa894 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -399,6 +399,14 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, outer2->probability = profile_probability::never (); } +/* FIXME: move to a header file. */ +extern tree +fold_truth_andor_maybe_separate (location_t loc, +enum tree_code code, tree truth_type, +enum tree_code lcode, tree ll_arg, tree lr_arg, +enum tree_code rcode, tree rl_arg, tree rr_arg, +tree *separatep); + /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. 
inner_inv, outer_inv and result_inv indicate whether the conditions @@ -576,7 +584,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, else if (TREE_CODE_CLASS (gimple_cond_code (inner_cond)) == tcc_comparison && TREE_CODE_CLASS (gimple_cond_code (outer_cond)) == tcc_comparison) { - tree t; + tree t, ts = NULL_TREE; enum tree_code inner_cond_code = gimple_cond_code (inner_cond); enum tree_code outer_cond_code = gimple_cond_code (outer_cond); @@ -599,7 +607,17 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, outer_cond_code, gimple_cond_lhs (outer_cond), gimple_cond_rhs (outer_cond), - gimple_bb (outer_cond + gimple_bb (outer_cond))) + && !(t = ts = (fold_truth_andor_maybe_separate +(UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, + boolean_type_node, + outer_cond_code, + gimple_cond_lhs (outer_cond), + gimple_cond_rhs (outer_cond), + inner_cond_code, + gimple_cond_lhs (inner_cond), + gimple_cond_rhs (inner_cond), + NULL { { tree t1, t2; @@ -636,7 +654,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, NULL, true, GSI_SAME_STMT); } /* ??? Fold should avoid this. */ - else if (!is_gimple_condexpr_for_cond (t)) + else if (ts && !is_gimple_condexpr_for_cond (t)) goto gimplify_after_fold; if (result_inv) t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t);
[gcc(refs/users/aoliva/heads/testme)] support noncontiguous ifcombine
https://gcc.gnu.org/g:a29037a8f9c752e41a906f0eac66ff3792e98bcc commit a29037a8f9c752e41a906f0eac66ff3792e98bcc Author: Alexandre Oliva Date: Tue Sep 17 20:15:55 2024 -0300 support noncontiguous ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 33 ++--- 1 file changed, 26 insertions(+), 7 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 3d57c615d827..79ccc70b2678 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -779,13 +779,13 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, if-conversion helper. We start with BB as the innermost worker basic-block. Returns true if a transformation was done. */ -static bool +static basic_block tree_ssa_ifcombine_bb (basic_block inner_cond_bb) { basic_block then_bb = NULL, else_bb = NULL; if (!recognize_if_then_else (inner_cond_bb, &then_bb, &else_bb)) -return false; +return NULL; /* Recognize && and || of two conditions with a common then/else block which entry edges we can merge. That is: @@ -802,7 +802,7 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, then_bb, else_bb, inner_cond_bb)) - return true; + return bb; if (forwarder_block_to (else_bb, then_bb)) { @@ -814,7 +814,7 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) edge from outer_cond_bb and the forwarder block. */ if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, else_bb, then_bb, else_bb)) - return true; + return bb; } else if (forwarder_block_to (then_bb, else_bb)) { @@ -826,11 +826,11 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) edge from outer_cond_bb and the forwarder block. */ if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, else_bb, then_bb, then_bb)) - return true; + return bb; } } - return false; + return NULL; } /* Main entry for the tree if-conversion pass. 
*/ @@ -881,12 +881,14 @@ pass_tree_ifcombine::execute (function *fun) inner ones, and also that we do not try to visit a removed block. This is opposite of PHI-OPT, because we cascade the combining rather than cascading PHIs. */ + basic_block seen = NULL; + bool changed = false; for (i = n_basic_blocks_for_fn (fun) - NUM_FIXED_BLOCKS - 1; i >= 0; i--) { basic_block bb = bbs[i]; if (safe_is_a <gcond *> (*gsi_last_bb (bb))) - if (tree_ssa_ifcombine_bb (bb)) + if (basic_block outer_bb = tree_ssa_ifcombine_bb (bb)) { /* Clear range info from all stmts in BB which is now executed conditional on a always true/false condition. */ @@ -905,7 +907,24 @@ pass_tree_ifcombine::execute (function *fun) rewrite_to_defined_overflow (&gsi); } cfg_changed |= true; + if (seen) + changed |= true; + else + seen = bb; + /* Go back and check whether the modified outer_bb can be further + optimized. ??? How could it? */ + do + i++; + while (bbs[i] != outer_bb); + continue; } + + if (bb == seen) + { + gcc_assert (!changed); + seen = NULL; + changed = false; + } } free (bbs);
[gcc(refs/users/aoliva/heads/testme)] refactor ifcombine
https://gcc.gnu.org/g:3ed1ed8f0533f3f3f4372a2280c4e1c29304cd78 commit 3ed1ed8f0533f3f3f4372a2280c4e1c29304cd78 Author: Alexandre Oliva Date: Thu Sep 19 02:43:51 2024 -0300 refactor ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 181 +++--- 1 file changed, 89 insertions(+), 92 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index eb4317bebdfb..3d57c615d827 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -107,6 +107,14 @@ recognize_if_then_else (basic_block cond_bb, if (!*else_bb) *else_bb = e->dest; + gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (cond_bb)); + if (!cond) +return false; + + if (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))) +return false; + return true; } @@ -407,15 +415,67 @@ fold_truth_andor_maybe_separate (location_t loc, enum tree_code rcode, tree rl_arg, tree rr_arg, tree *separatep); +/* Replace the conditions in INNER_COND and OUTER_COND with COND and COND2. + COND and COND2 are computed for insertion at INNER_COND, with OUTER_COND + replaced with a constant, but if there are intervening blocks, it's best to + adjust COND for insertion at OUTER_COND, placing COND2 at INNER_COND. */ + +static tree +ifcombine_replace_cond (gcond *inner_cond, bool inner_inv, + gcond *outer_cond, bool outer_inv, + tree cond, bool must_canon, + tree cond2) +{ + tree t = cond; + bool result_inv = inner_inv; + + /* ??? Support intervening blocks. */ + if (single_pred (gimple_bb (inner_cond)) != gimple_bb (outer_cond)) +return NULL_TREE; + + /* ??? Use both conditions. */ + if (cond2) +t = fold_build2 (TRUTH_AND_EXPR, TREE_TYPE (t), cond, cond2); + + /* ??? Insert at outer_cond.
*/ + if (result_inv) +t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t); + tree ret = t; + + if (tree tcanon = canonicalize_cond_expr_cond (t)) +ret = t = tcanon; + else if (must_canon) +return NULL_TREE; + if (!is_gimple_condexpr_for_cond (t)) +{ + gimple_stmt_iterator gsi = gsi_for_stmt (inner_cond); + t = force_gimple_operand_gsi_1 (&gsi, t, is_gimple_condexpr_for_cond, + NULL, true, GSI_SAME_STMT); +} + gimple_cond_set_condition_from_tree (inner_cond, t); + update_stmt (inner_cond); + + /* Leave CFG optimization to cfg_cleanup. */ + gimple_cond_set_condition_from_tree (outer_cond, + outer_inv + ? boolean_false_node + : boolean_true_node); + update_stmt (outer_cond); + + update_profile_after_ifcombine (gimple_bb (inner_cond), + gimple_bb (outer_cond)); + + return ret; +} + /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. - inner_inv, outer_inv and result_inv indicate whether the conditions - are inverted. + inner_inv, outer_inv indicate whether the conditions are inverted. Returns true if the edges to the common else basic-block were merged. */ static bool ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, - basic_block outer_cond_bb, bool outer_inv, bool result_inv) + basic_block outer_cond_bb, bool outer_inv) { gimple_stmt_iterator gsi; tree name1, name2, bit1, bit2, bits1, bits2; @@ -454,26 +514,13 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, t2 = fold_build2 (BIT_AND_EXPR, TREE_TYPE (name1), name1, t); t2 = force_gimple_operand_gsi (&gsi, t2, true, NULL_TREE, true, GSI_SAME_STMT); - t = fold_build2 (result_inv ? 
NE_EXPR : EQ_EXPR, - boolean_type_node, t2, t); - t = canonicalize_cond_expr_cond (t); - if (!t) - return false; - if (!is_gimple_condexpr_for_cond (t)) - { - gsi = gsi_for_stmt (inner_cond); - t = force_gimple_operand_gsi_1 (&gsi, t, is_gimple_condexpr_for_cond, - NULL, true, GSI_SAME_STMT); - } - gimple_cond_set_condition_from_tree (inner_cond, t); - update_stmt (inner_cond); - /* Leave CFG optimization to cfg_cleanup. */ - gimple_cond_set_condition_from_tree (outer_cond, - outer_inv ? boolean_false_node : boolean_true_node); - update_stmt (outer_cond); + t = fold_build2 (EQ_EXPR, boolean_type_node, t2, t); - update_profile_after_ifcombine (inner_cond_bb, outer_cond_bb); + if (!ifcombine_replace_cond (inner_cond, inner_inv, + outer_cond, outer_inv, + t, true, NULL_TREE)) + return false; if (dump_f
[gcc(refs/users/aoliva/heads/testme)] support noncontiguous ifcombine
https://gcc.gnu.org/g:b0b68cbc1ed13ee0c61e0e2d768d997e8a1dfaa8 commit b0b68cbc1ed13ee0c61e0e2d768d997e8a1dfaa8 Author: Alexandre Oliva Date: Tue Sep 17 20:15:50 2024 -0300 support noncontiguous ifcombine Diff: --- gcc/tree-ssa-ifcombine.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 7678c87e0170..eb4317bebdfb 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -798,10 +798,10 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) if (a && b) ; This requires a single predecessor of the inner cond_bb. */ - if (single_pred_p (inner_cond_bb) - && bb_no_side_effects_p (inner_cond_bb)) + for (basic_block bb = inner_cond_bb; + single_pred_p (bb) && bb_no_side_effects_p (bb); ) { - basic_block outer_cond_bb = single_pred (inner_cond_bb); + basic_block outer_cond_bb = bb = single_pred (bb); if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, then_bb, else_bb, inner_cond_bb))
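The loop introduced in the hunk above walks up the chain of single predecessors instead of looking only at the immediate one. A minimal sketch of that control flow follows, using a toy block type rather than GCC's `basic_block`. Everything here is an assumption made for illustration: `struct toy_bb`, its fields, and `find_outer_cond` are hypothetical, and the `mergeable` flag merely stands in for `tree_ssa_ifcombine_bb_1` succeeding.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a basic block.  */
struct toy_bb
{
  struct toy_bb *single_pred;	/* Sole predecessor, or NULL if none or several.  */
  int has_side_effects;		/* Stand-in for !bb_no_side_effects_p (bb).  */
  int mergeable;		/* Stand-in for tree_ssa_ifcombine_bb_1 succeeding.  */
};

/* Starting at INNER, step to each single predecessor in turn and return the
   first one the inner condition can be merged with.  The walk stops as soon
   as the block being left has side effects or lacks a single predecessor,
   so no intervening side effect is ever skipped over.  */
struct toy_bb *
find_outer_cond (struct toy_bb *inner)
{
  assert (inner != NULL);

  for (struct toy_bb *bb = inner;
       bb->single_pred != NULL && !bb->has_side_effects; )
    {
      struct toy_bb *outer = bb = bb->single_pred;
      if (outer->mergeable)
	return outer;
    }

  return NULL;
}
```

With a chain `a -> b -> c` where only `a` is mergeable, the walk from `c` reaches `a` as long as neither `c` nor `b` has side effects; marking `b` as having side effects makes the walk give up and return `NULL`.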
[gcc(refs/users/aoliva/heads/testme)] relax ifcombine to accept vuses
https://gcc.gnu.org/g:575a4da1213668119e0e60326a7b18f7c1a342d6 commit 575a4da1213668119e0e60326a7b18f7c1a342d6 Author: Alexandre Oliva Date: Tue Sep 17 20:15:46 2024 -0300 relax ifcombine to accept vuses Diff: --- gcc/config/i386/t-i386 | 2 ++ gcc/testsuite/gcc.dg/field-merge-6.c | 26 ++ gcc/tree-ssa-ifcombine.cc| 2 +- 3 files changed, 29 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/t-i386 b/gcc/config/i386/t-i386 index bf4ae109af98..1b904787ec62 100644 --- a/gcc/config/i386/t-i386 +++ b/gcc/config/i386/t-i386 @@ -79,3 +79,5 @@ s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \ $(AWK) -f $^ > tmp-bt.inc $(SHELL) $(srcdir)/../move-if-change tmp-bt.inc i386-builtin-types.inc $(STAMP) $@ + +insn-attrtab.o-warn = -Wno-error diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c new file mode 100644 index ..7fd48a138d14 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-6.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ +/* { dg-shouldfail } */ + +/* Check that the third compare won't be pulled ahead of the second one and + prevent, which would prevent the NULL pointer dereference that should cause + the execution to fail. */ + +struct s { + char a, b; + int *p; +}; + +struct s a = { 0, 1, 0 }; +struct s b = { 0, 0, 0 }; + +int f () { + return (a.a != b.a + || *b.p != *a.p + || a.b != b.b); +} + +int main() { + f (); + return 0; +} diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 61480e5fa894..7678c87e0170 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -129,7 +129,7 @@ bb_no_side_effects_p (basic_block bb) enum tree_code rhs_code; if (gimple_has_side_effects (stmt) || gimple_could_trap_p (stmt) - || gimple_vuse (stmt) + /* || gimple_vuse (stmt) */ /* We need to rewrite stmts with undefined overflow to use unsigned arithmetic but cannot do so for signed division. */ || ((ass = dyn_cast <gassign *> (stmt))
[gcc(refs/users/aoliva/heads/testme)] check for mergeable loads, choose insertion points accordingly
https://gcc.gnu.org/g:6ce741d00f03f73e1fb3e797e85707aef9cfd832 commit 6ce741d00f03f73e1fb3e797e85707aef9cfd832 Author: Alexandre Oliva Date: Tue Sep 17 20:15:28 2024 -0300 check for mergeable loads, choose insertion points accordingly Diff: --- gcc/gimple-fold.cc | 253 ++--- 1 file changed, 219 insertions(+), 34 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 64426bd76977..85a0ec028030 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see #include "varasm.h" #include "internal-fn.h" #include "gimple-range.h" +#include "tree-ssa-loop-niter.h" // stmt_dominates_stmt_p /* ??? Move this to some header, it's defined in fold-const.c. */ extern tree @@ -7395,7 +7396,7 @@ maybe_fold_comparisons_from_match_pd (tree type, enum tree_code code, Same as ssa_is_replaceable_p, except that we don't insist it has a single use. */ -bool +static bool ssa_is_substitutable_p (gimple *stmt) { #if 0 @@ -7476,9 +7477,10 @@ is_cast_p (tree *name) if (gimple_num_ops (def) != 2) break; - if (get_gimple_rhs_class (gimple_expr_code (def)) - == GIMPLE_SINGLE_RHS) + if (gimple_assign_single_p (def)) { + if (gimple_assign_load_p (def)) + break; *name = gimple_assign_rhs1 (def); continue; } @@ -7515,8 +7517,7 @@ is_binop_p (enum tree_code code, tree *name) return 0; case 2: - if (get_gimple_rhs_class (gimple_expr_code (def)) - == GIMPLE_SINGLE_RHS) + if (gimple_assign_single_p (def) && !gimple_assign_load_p (def)) { *name = gimple_assign_rhs1 (def); continue; @@ -7524,7 +7525,7 @@ is_binop_p (enum tree_code code, tree *name) return 0; case 3: - ; + break; } if (gimple_assign_rhs_code (def) != code) @@ -7569,6 +7570,26 @@ prepare_xor (tree l_arg, tree *r_arg) return ret; } +/* If EXP is a SSA_NAME whose DEF is a load stmt, set *LOAD to it and + return its RHS, otherwise return EXP. 
*/ + +static tree +follow_load (tree exp, gimple **load) +{ + if (TREE_CODE (exp) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (exp)) +{ + gimple *def = SSA_NAME_DEF_STMT (exp); + if (gimple_assign_load_p (def)) + { + *load = def; + exp = gimple_assign_rhs1 (def); + } +} + + return exp; +} + /* Subroutine for fold_truth_andor_1: decode a field reference. If EXP is a comparison reference, we return the innermost reference. @@ -7595,6 +7616,9 @@ prepare_xor (tree l_arg, tree *r_arg) BIT_XOR_EXPR compared with zero. We're to take the first or second operand thereof if so. It should be zero otherwise. + *LOAD is set to the load stmt of the innermost reference, if any, + *and NULL otherwise. + Return 0 if this is not a component reference or is one that we can't do anything with. */ @@ -7602,7 +7626,8 @@ static tree decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, HOST_WIDE_INT *pbitpos, machine_mode *pmode, int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask, int xor_which) + tree *pmask, tree *pand_mask, int xor_which, + gimple **load) { tree exp = *exp_; tree outer_type = 0; @@ -7612,11 +7637,13 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, unsigned int precision; HOST_WIDE_INT shiftrt = 0; + *load = NULL; + /* All the optimizations using this function assume integer fields. There are problems with FP fields since the type_for_size call below can fail for, e.g., XFmode. */ if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) -return 0; +return NULL_TREE; /* We are interested in the bare arrangement of bits, so strip everything that doesn't affect the machine mode. 
However, record the type of the @@ -7626,7 +7653,7 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, if ((and_mask = is_binop_p (BIT_AND_EXPR, &exp))) { if (TREE_CODE (and_mask) != INTEGER_CST) - return 0; + return NULL_TREE; } if (xor_which) @@ -7644,16 +7671,18 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, if (tree shift = is_binop_p (RSHIFT_EXPR, &exp)) { if (TREE_CODE (shift) != INTEGER_CST || !tree_fits_shwi_p (shift)) - return 0; + return NULL_TREE; shiftrt = tree_to_shwi (shift); if (shiftrt <= 0) - return 0; + return NULL_TREE; } if (tree t = is_cast_p (&exp)) if (!outer_type) outer_type = t; + exp = follow_load (exp, load); + poly_int64 poly_bitsize, poly_bitpos; inner = ge
[gcc/aoliva/heads/testme] (46 commits) support noncontiguous ifcombine
The branch 'aoliva/heads/testme' was updated to point to: a29037a8f9c7... support noncontiguous ifcombine It previously pointed to: 8a7e9581280c... support noncontiguous ifcombine Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 8a7e958... support noncontiguous ifcombine 4f6753d... support noncontiguous ifcombine 7b7dfff... relax ifcombine to accept vuses e731ae8... fold truth-and only in ifcombine fbf1f80... check for mergeable loads, choose insertion points accordin b4b872b... rework truth_andor folding into tree-ssa-ifcombine 8aa412b... assorted improvements for fold_truth_andor_1 Summary of changes (added commits): --- a29037a... support noncontiguous ifcombine 3ed1ed8... refactor ifcombine b0b68cb... support noncontiguous ifcombine 575a4da... relax ifcombine to accept vuses 15a55a9... fold truth-and only in ifcombine 6ce741d... check for mergeable loads, choose insertion points accordin d041471... rework truth_andor folding into tree-ssa-ifcombine d675d49... assorted improvements for fold_truth_andor_1 d6d8445... c++: fix constexpr cast from void* diag issue [PR116741] (*) 7ca4868... c++: ICE with -Wtautological-compare in template [PR116534] (*) dfe0d43... c++: crash with anon VAR_DECL [PR116676] (*) e311dd1... SVE intrinsics: Fold svdiv with all-zero operands to zero v (*) 008f451... Daily bump. (*) a92f54f... aarch64: Improve vector constant generation using SVE INDEX (*) 58bc39c... modula2: gcc/m2/Make-lang.in fix includes during bootstrap (*) f544838... AVR: Update weblinks to AVR-LibC. (*) 4af196b... aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for SVE i (*) f6e629a... PR modula2/116181 Use GCC tree location_t and separate poin (*) 7fb1117... AVR: Tweak >= and < compares with consts that are 0 mod 256 (*) 952df9c... riscv: Fix duplicate assmbler label in @tlsdesc insn (*) eb67e23... libstdc++: Add .editorconfig files (*) 48a0f69... vect: Set pattern_stmt_p on the newly created stmt_vec_info (*) 8d402c3... 
AVR: Tidy up enum and struct tags. (*) 9f8e182... AVR: Partially revert r15-3623. (*) 719edcb... libstdc++: Update link to installation docs (*) 4f2cd25... Daily bump. (*) d204bee... fortran: Remove useless nested end of scalarization chain h (*) a9f9391... c++: __extension__ and -Wconditionally-supported (*) 5ef73ba... c++: conversion location (*) 2af87d9... libstdc++: Adjust std::span::iterator to be ADL-proof (*) 1dde83f... libstdc++: Enable most of for freestanding (*) f91fe35... libstdc++: Add assertion for valid facet type arguments (*) c5fd1a4... libstdc++: Make PSTL algorithms accept C++20 iterators [PR1 (*) 368ba7a... c++, coroutines: Fix handling of bool await_suspend() [PR11 (*) 6e4244e... phi-opt: Improve heuristics for factoring out with constant (*) 0b31335... vect: release defs of removed statement (*) d2f10fc... Mark the copy/move constructor/operator= of auto_bitmap as (*) e07fbc9... Daily bump. (*) 1dd6dd1... testsuite; Fix execute/pr52286.c for 16bit (*) 8b5e547... c++: avoid init_priority warning in system header (*) 005f717... c++: Don't mix timevar_start and auto_cond_timevar for TV_N (*) a900349... AVR: Use rtx code copysign. (*) 99b8be4... libstdc++: Tweak localized formatting for floating-point ty (*) 01670a4... libstdc++: Refactor loops in std::__platform_semaphore (*) 49cb715... testsuite: adjust pragma-diag-17.c diagnostics (*) bec1f2c... c++: Fix g++.dg/ext/sve-sizeless-1.C regression (*) (*) This commit already exists in another branch. Because the reference `refs/users/aoliva/heads/testme' matches your hooks.email-new-commits-only configuration, no separate email is sent for this commit.
[gcc/aoliva/heads/testbase] (38 commits) c++: fix constexpr cast from void* diag issue [PR116741]
The branch 'aoliva/heads/testbase' was updated to point to: d6d8445c8550... c++: fix constexpr cast from void* diag issue [PR116741] It previously pointed to: b56bd542942b... testsuite: a few more hostedlib adjustments Diff: Summary of changes (added commits): --- d6d8445... c++: fix constexpr cast from void* diag issue [PR116741] (*) 7ca4868... c++: ICE with -Wtautological-compare in template [PR116534] (*) dfe0d43... c++: crash with anon VAR_DECL [PR116676] (*) e311dd1... SVE intrinsics: Fold svdiv with all-zero operands to zero v (*) 008f451... Daily bump. (*) a92f54f... aarch64: Improve vector constant generation using SVE INDEX (*) 58bc39c... modula2: gcc/m2/Make-lang.in fix includes during bootstrap (*) f544838... AVR: Update weblinks to AVR-LibC. (*) 4af196b... aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for SVE i (*) f6e629a... PR modula2/116181 Use GCC tree location_t and separate poin (*) 7fb1117... AVR: Tweak >= and < compares with consts that are 0 mod 256 (*) 952df9c... riscv: Fix duplicate assmbler label in @tlsdesc insn (*) eb67e23... libstdc++: Add .editorconfig files (*) 48a0f69... vect: Set pattern_stmt_p on the newly created stmt_vec_info (*) 8d402c3... AVR: Tidy up enum and struct tags. (*) 9f8e182... AVR: Partially revert r15-3623. (*) 719edcb... libstdc++: Update link to installation docs (*) 4f2cd25... Daily bump. (*) d204bee... fortran: Remove useless nested end of scalarization chain h (*) a9f9391... c++: __extension__ and -Wconditionally-supported (*) 5ef73ba... c++: conversion location (*) 2af87d9... libstdc++: Adjust std::span::iterator to be ADL-proof (*) 1dde83f... libstdc++: Enable most of for freestanding (*) f91fe35... libstdc++: Add assertion for valid facet type arguments (*) c5fd1a4... libstdc++: Make PSTL algorithms accept C++20 iterators [PR1 (*) 368ba7a... c++, coroutines: Fix handling of bool await_suspend() [PR11 (*) 6e4244e... phi-opt: Improve heuristics for factoring out with constant (*) 0b31335... 
vect: release defs of removed statement (*) d2f10fc... Mark the copy/move constructor/operator= of auto_bitmap as (*) e07fbc9... Daily bump. (*) 1dd6dd1... testsuite; Fix execute/pr52286.c for 16bit (*) 8b5e547... c++: avoid init_priority warning in system header (*) 005f717... c++: Don't mix timevar_start and auto_cond_timevar for TV_N (*) a900349... AVR: Use rtx code copysign. (*) 99b8be4... libstdc++: Tweak localized formatting for floating-point ty (*) 01670a4... libstdc++: Refactor loops in std::__platform_semaphore (*) 49cb715... testsuite: adjust pragma-diag-17.c diagnostics (*) bec1f2c... c++: Fix g++.dg/ext/sve-sizeless-1.C regression (*) (*) This commit already exists in another branch. Because the reference `refs/users/aoliva/heads/testbase' matches your hooks.email-new-commits-only configuration, no separate email is sent for this commit.
[gcc(refs/users/aoliva/heads/testme)] support noncontiguous ifcombine
https://gcc.gnu.org/g:8a7e9581280c41f3c18cba7fafe110b4108a07a7

commit 8a7e9581280c41f3c18cba7fafe110b4108a07a7
Author: Alexandre Oliva
Date:   Sat Sep 14 03:40:26 2024 -0300

    support noncontiguous ifcombine

Diff:
---
 gcc/tree-ssa-ifcombine.cc | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index eb4317bebdfb..b52d343feb91 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -782,13 +782,13 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
    if-conversion helper.  We start with BB as the innermost
    worker basic-block.  Returns true if a transformation was done.  */
 
-static bool
+static basic_block
 tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 {
   basic_block then_bb = NULL, else_bb = NULL;
 
   if (!recognize_if_then_else (inner_cond_bb, &then_bb, &else_bb))
-    return false;
+    return NULL;
 
   /* Recognize && and || of two conditions with a common
      then/else block which entry edges we can merge.  That is:
@@ -805,7 +805,7 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 
       if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
                                    then_bb, else_bb, inner_cond_bb))
-        return true;
+        return bb;
 
       if (forwarder_block_to (else_bb, then_bb))
         {
@@ -817,7 +817,7 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
              edge from outer_cond_bb and the forwarder block.  */
           if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
                                        else_bb, then_bb, else_bb))
-            return true;
+            return bb;
         }
       else if (forwarder_block_to (then_bb, else_bb))
         {
@@ -829,11 +829,11 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
              edge from outer_cond_bb and the forwarder block.  */
           if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
                                        else_bb, then_bb, then_bb))
-            return true;
+            return bb;
         }
     }
 
-  return false;
+  return NULL;
 }
 
 /* Main entry for the tree if-conversion pass.  */
@@ -884,12 +884,14 @@ pass_tree_ifcombine::execute (function *fun)
      inner ones, and also that we do not try to visit a removed
      block.  This is opposite of PHI-OPT, because we cascade the
      combining rather than cascading PHIs. */
+  basic_block seen = NULL;
+  bool changed = false;
   for (i = n_basic_blocks_for_fn (fun) - NUM_FIXED_BLOCKS - 1; i >= 0; i--)
     {
       basic_block bb = bbs[i];
 
       if (safe_is_a <gcond *> (*gsi_last_bb (bb)))
-        if (tree_ssa_ifcombine_bb (bb))
+        if (basic_block outer_bb = tree_ssa_ifcombine_bb (bb))
           {
             /* Clear range info from all stmts in BB which is now executed
                conditional on a always true/false condition.  */
@@ -908,7 +910,24 @@ pass_tree_ifcombine::execute (function *fun)
                   rewrite_to_defined_overflow (&gsi);
               }
             cfg_changed |= true;
+            if (seen)
+              changed |= true;
+            else
+              seen = bb;
+            /* Go back and check whether the modified outer_bb can be further
+               optimized.  ??? How could it?  */
+            do
+              i++;
+            while (bbs[i] != outer_bb);
+            continue;
           }
+
+      if (bb == seen)
+        {
+          gcc_assert (!changed);
+          seen = NULL;
+          changed = false;
+        }
     }
 
   free (bbs);
[gcc(refs/users/aoliva/heads/testme)] support noncontiguous ifcombine
https://gcc.gnu.org/g:4f6753de737fb45d78634c35c4c50a546357f70d

commit 4f6753de737fb45d78634c35c4c50a546357f70d
Author: Alexandre Oliva
Date:   Sat Sep 14 03:40:26 2024 -0300

    support noncontiguous ifcombine

Diff:
---
 gcc/tree-ssa-ifcombine.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 7678c87e0170..eb4317bebdfb 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -798,10 +798,10 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
          if (a && b)
            ;
      This requires a single predecessor of the inner cond_bb.  */
-  if (single_pred_p (inner_cond_bb)
-      && bb_no_side_effects_p (inner_cond_bb))
+  for (basic_block bb = inner_cond_bb;
+       single_pred_p (bb) && bb_no_side_effects_p (bb); )
     {
-      basic_block outer_cond_bb = single_pred (inner_cond_bb);
+      basic_block outer_cond_bb = bb = single_pred (bb);
 
       if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
                                    then_bb, else_bb, inner_cond_bb))
[gcc(refs/users/aoliva/heads/testme)] relax ifcombine to accept vuses
https://gcc.gnu.org/g:7b7dfff4b174765248fcf275dc8fde9c78352ad8

commit 7b7dfff4b174765248fcf275dc8fde9c78352ad8
Author: Alexandre Oliva
Date:   Fri Sep 13 21:43:15 2024 -0300

    relax ifcombine to accept vuses

Diff:
---
 gcc/config/i386/t-i386               | 2 ++
 gcc/testsuite/gcc.dg/field-merge-6.c | 26 ++
 gcc/tree-ssa-ifcombine.cc            | 2 +-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/t-i386 b/gcc/config/i386/t-i386
index bf4ae109af98..1b904787ec62 100644
--- a/gcc/config/i386/t-i386
+++ b/gcc/config/i386/t-i386
@@ -79,3 +79,5 @@ s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \
   $(AWK) -f $^ > tmp-bt.inc
   $(SHELL) $(srcdir)/../move-if-change tmp-bt.inc i386-builtin-types.inc
   $(STAMP) $@
+
+insn-attrtab.o-warn = -Wno-error
diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c
new file mode 100644
index 000000000000..7fd48a138d14
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-6.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+/* { dg-shouldfail } */
+
+/* Check that the third compare won't be pulled ahead of the second one and
+   prevent, which would prevent the NULL pointer dereference that should cause
+   the execution to fail.  */
+
+struct s {
+  char a, b;
+  int *p;
+};
+
+struct s a = { 0, 1, 0 };
+struct s b = { 0, 0, 0 };
+
+int f () {
+  return (a.a != b.a
+          || *b.p != *a.p
+          || a.b != b.b);
+}
+
+int main() {
+  f ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 61480e5fa894..7678c87e0170 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -129,7 +129,7 @@ bb_no_side_effects_p (basic_block bb)
       enum tree_code rhs_code;
       if (gimple_has_side_effects (stmt)
           || gimple_could_trap_p (stmt)
-          || gimple_vuse (stmt)
+          /* || gimple_vuse (stmt) */
           /* We need to rewrite stmts with undefined overflow to use
              unsigned arithmetic but cannot do so for signed division.  */
           || ((ass = dyn_cast <gassign *> (stmt))
[gcc(refs/users/aoliva/heads/testme)] fold truth-and only in ifcombine
https://gcc.gnu.org/g:e731ae8c98953fb898a938ee0a0be19e2ea906d7 commit e731ae8c98953fb898a938ee0a0be19e2ea906d7 Author: Alexandre Oliva Date: Fri Sep 13 21:43:10 2024 -0300 fold truth-and only in ifcombine Diff: --- gcc/gimple-fold.cc| 2 ++ gcc/tree-ssa-ifcombine.cc | 24 +--- 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 85a0ec028030..5b7d83edbea9 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -8738,12 +8738,14 @@ maybe_fold_and_comparisons (tree type, op2b, outer_cond_bb)) return t; +#if 0 if (tree t = fold_truth_andor_maybe_separate (UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, type, code2, op2a, op2b, code1, op1a, op1b, NULL)) return t; +#endif return NULL_TREE; } diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79a4bdd363b9..61480e5fa894 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -399,6 +399,14 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, outer2->probability = profile_probability::never (); } +/* FIXME: move to a header file. */ +extern tree +fold_truth_andor_maybe_separate (location_t loc, +enum tree_code code, tree truth_type, +enum tree_code lcode, tree ll_arg, tree lr_arg, +enum tree_code rcode, tree rl_arg, tree rr_arg, +tree *separatep); + /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. 
inner_inv, outer_inv and result_inv indicate whether the conditions @@ -576,7 +584,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, else if (TREE_CODE_CLASS (gimple_cond_code (inner_cond)) == tcc_comparison && TREE_CODE_CLASS (gimple_cond_code (outer_cond)) == tcc_comparison) { - tree t; + tree t, ts = NULL_TREE; enum tree_code inner_cond_code = gimple_cond_code (inner_cond); enum tree_code outer_cond_code = gimple_cond_code (outer_cond); @@ -599,7 +607,17 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, outer_cond_code, gimple_cond_lhs (outer_cond), gimple_cond_rhs (outer_cond), - gimple_bb (outer_cond + gimple_bb (outer_cond))) + && !(t = ts = (fold_truth_andor_maybe_separate +(UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, + boolean_type_node, + outer_cond_code, + gimple_cond_lhs (outer_cond), + gimple_cond_rhs (outer_cond), + inner_cond_code, + gimple_cond_lhs (inner_cond), + gimple_cond_rhs (inner_cond), + NULL { { tree t1, t2; @@ -636,7 +654,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, NULL, true, GSI_SAME_STMT); } /* ??? Fold should avoid this. */ - else if (!is_gimple_condexpr_for_cond (t)) + else if (ts && !is_gimple_condexpr_for_cond (t)) goto gimplify_after_fold; if (result_inv) t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t);
[gcc(refs/users/aoliva/heads/testme)] check for mergeable loads, choose insertion points accordingly
https://gcc.gnu.org/g:fbf1f8007325adf2d4b70f3b5a26d5c666c815e3 commit fbf1f8007325adf2d4b70f3b5a26d5c666c815e3 Author: Alexandre Oliva Date: Fri Sep 13 21:43:06 2024 -0300 check for mergeable loads, choose insertion points accordingly Diff: --- gcc/gimple-fold.cc | 253 ++--- 1 file changed, 219 insertions(+), 34 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 64426bd76977..85a0ec028030 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see #include "varasm.h" #include "internal-fn.h" #include "gimple-range.h" +#include "tree-ssa-loop-niter.h" // stmt_dominates_stmt_p /* ??? Move this to some header, it's defined in fold-const.c. */ extern tree @@ -7395,7 +7396,7 @@ maybe_fold_comparisons_from_match_pd (tree type, enum tree_code code, Same as ssa_is_replaceable_p, except that we don't insist it has a single use. */ -bool +static bool ssa_is_substitutable_p (gimple *stmt) { #if 0 @@ -7476,9 +7477,10 @@ is_cast_p (tree *name) if (gimple_num_ops (def) != 2) break; - if (get_gimple_rhs_class (gimple_expr_code (def)) - == GIMPLE_SINGLE_RHS) + if (gimple_assign_single_p (def)) { + if (gimple_assign_load_p (def)) + break; *name = gimple_assign_rhs1 (def); continue; } @@ -7515,8 +7517,7 @@ is_binop_p (enum tree_code code, tree *name) return 0; case 2: - if (get_gimple_rhs_class (gimple_expr_code (def)) - == GIMPLE_SINGLE_RHS) + if (gimple_assign_single_p (def) && !gimple_assign_load_p (def)) { *name = gimple_assign_rhs1 (def); continue; @@ -7524,7 +7525,7 @@ is_binop_p (enum tree_code code, tree *name) return 0; case 3: - ; + break; } if (gimple_assign_rhs_code (def) != code) @@ -7569,6 +7570,26 @@ prepare_xor (tree l_arg, tree *r_arg) return ret; } +/* If EXP is a SSA_NAME whose DEF is a load stmt, set *LOAD to it and + return its RHS, otherwise return EXP. 
*/ + +static tree +follow_load (tree exp, gimple **load) +{ + if (TREE_CODE (exp) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (exp)) +{ + gimple *def = SSA_NAME_DEF_STMT (exp); + if (gimple_assign_load_p (def)) + { + *load = def; + exp = gimple_assign_rhs1 (def); + } +} + + return exp; +} + /* Subroutine for fold_truth_andor_1: decode a field reference. If EXP is a comparison reference, we return the innermost reference. @@ -7595,6 +7616,9 @@ prepare_xor (tree l_arg, tree *r_arg) BIT_XOR_EXPR compared with zero. We're to take the first or second operand thereof if so. It should be zero otherwise. + *LOAD is set to the load stmt of the innermost reference, if any, + *and NULL otherwise. + Return 0 if this is not a component reference or is one that we can't do anything with. */ @@ -7602,7 +7626,8 @@ static tree decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, HOST_WIDE_INT *pbitpos, machine_mode *pmode, int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask, int xor_which) + tree *pmask, tree *pand_mask, int xor_which, + gimple **load) { tree exp = *exp_; tree outer_type = 0; @@ -7612,11 +7637,13 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, unsigned int precision; HOST_WIDE_INT shiftrt = 0; + *load = NULL; + /* All the optimizations using this function assume integer fields. There are problems with FP fields since the type_for_size call below can fail for, e.g., XFmode. */ if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) -return 0; +return NULL_TREE; /* We are interested in the bare arrangement of bits, so strip everything that doesn't affect the machine mode. 
However, record the type of the @@ -7626,7 +7653,7 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, if ((and_mask = is_binop_p (BIT_AND_EXPR, &exp))) { if (TREE_CODE (and_mask) != INTEGER_CST) - return 0; + return NULL_TREE; } if (xor_which) @@ -7644,16 +7671,18 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, if (tree shift = is_binop_p (RSHIFT_EXPR, &exp)) { if (TREE_CODE (shift) != INTEGER_CST || !tree_fits_shwi_p (shift)) - return 0; + return NULL_TREE; shiftrt = tree_to_shwi (shift); if (shiftrt <= 0) - return 0; + return NULL_TREE; } if (tree t = is_cast_p (&exp)) if (!outer_type) outer_type = t; + exp = follow_load (exp, load); + poly_int64 poly_bitsize, poly_bitpos; inner = ge
[gcc(refs/users/aoliva/heads/testme)] rework truth_andor folding into tree-ssa-ifcombine
https://gcc.gnu.org/g:b4b872b2195b448b3e1bcd28c3e28d59618580a2 commit b4b872b2195b448b3e1bcd28c3e28d59618580a2 Author: Alexandre Oliva Date: Fri Sep 13 21:43:00 2024 -0300 rework truth_andor folding into tree-ssa-ifcombine Diff: --- gcc/fold-const.cc | 1048 + gcc/gimple-fold.cc| 1149 + gcc/tree-ssa-ifcombine.cc |7 +- 3 files changed, 1170 insertions(+), 1034 deletions(-) diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 81814de5b04b..19824e6a477f 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -137,7 +137,6 @@ static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code, tree, tree, tree, tree); -static tree unextend (tree, int, int, tree); static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *); static tree fold_binary_op_with_conditional_arg (location_t, @@ -4695,7 +4694,7 @@ invert_truthvalue_loc (location_t loc, tree arg) is the original memory reference used to preserve the alias set of the access. */ -static tree +tree make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type, HOST_WIDE_INT bitsize, poly_int64 bitpos, int unsignedp, int reversep) @@ -4945,212 +4944,6 @@ optimize_bit_field_compare (location_t loc, enum tree_code code, return lhs; } -/* If *R_ARG is a constant zero, and L_ARG is a possibly masked - BIT_XOR_EXPR, return 1 and set *r_arg to l_arg. - Otherwise, return 0. - - The returned value should be passed to decode_field_reference for it - to handle l_arg, and then doubled for r_arg. 
*/ -static int -prepare_xor (tree l_arg, tree *r_arg) -{ - int ret = 0; - - if (!integer_zerop (*r_arg)) -return ret; - - tree exp = l_arg; - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) -{ - tree and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return ret; -} - - if (TREE_CODE (exp) == BIT_XOR_EXPR) -{ - *r_arg = l_arg; - return 1; -} - - return ret; -} - -/* Subroutine for fold_truth_andor_1: decode a field reference. - - If EXP is a comparison reference, we return the innermost reference. - - *PBITSIZE is set to the number of bits in the reference, *PBITPOS is - set to the starting bit number. - - If the innermost field can be completely contained in a mode-sized - unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. - - *PVOLATILEP is set to 1 if the any expression encountered is volatile; - otherwise it is not changed. - - *PUNSIGNEDP is set to the signedness of the field. - - *PREVERSEP is set to the storage order of the field. - - *PMASK is set to the mask used. This is either contained in a - BIT_AND_EXPR or derived from the width of the field. - - *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. - - XOR_WHICH is 1 or 2 if EXP was found to be a (possibly masked) - BIT_XOR_EXPR compared with zero. We're to take the first or second - operand thereof if so. It should be zero otherwise. - - Return 0 if this is not a component reference or is one that we can't - do anything with. 
*/ - -static tree -decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, - HOST_WIDE_INT *pbitpos, machine_mode *pmode, - int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask, int xor_which) -{ - tree exp = *exp_; - tree outer_type = 0; - tree and_mask = 0; - tree mask, inner, offset; - tree unsigned_type; - unsigned int precision; - HOST_WIDE_INT shiftrt = 0; - - /* All the optimizations using this function assume integer fields. - There are problems with FP fields since the type_for_size call - below can fail for, e.g., XFmode. */ - if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) -return NULL_TREE; - - /* We are interested in the bare arrangement of bits, so strip everything - that doesn't affect the machine mode. However, record the type of the - outermost expression if it may matter below. */ - if (CONVERT_EXPR_P (exp) - || TREE_CODE (exp) == NON_LVALUE_EXPR) -outer_type = TREE_TYPE (exp); - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) -{ - and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return NULL_TREE; -} - - if (xor_which) -{ - gcc_checking_assert (TREE_CODE (exp) == BIT_XOR_EXPR); -
[gcc(refs/users/aoliva/heads/testme)] assorted improvements for fold_truth_andor_1
https://gcc.gnu.org/g:8aa412b62ba7275d5d0f861ef8e0306d5023b028

commit 8aa412b62ba7275d5d0f861ef8e0306d5023b028
Author: Alexandre Oliva
Date:   Fri Sep 13 21:42:56 2024 -0300

    assorted improvements for fold_truth_andor_1

    This patch introduces various improvements to the logic that merges
    field compares.

    Before the patch, we could merge:

      (a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1)

    into something like:

      (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

    if both of A's fields live within the same alignment boundaries, and
    so do B's, at the same relative positions.  Constants may be used
    instead of the object B.

    The initial goal of this patch was to enable such combinations when a
    field crossed alignment boundaries, e.g. for packed types.  We can't
    generally access such fields with a single memory access, so when we
    come across such a compare, we will attempt to combine each access
    separately.

    Some merging opportunities were missed because of right-shifts,
    compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
    narrowing conversions, especially after earlier merges.  This patch
    introduces handlers for several cases involving these.

    Other merging opportunities were missed because of association.  The
    existing logic would only succeed in merging a pair of consecutive
    compares, or e.g. B with C in (A ANDOR B) ANDOR C, not even trying
    e.g. C and D in (A ANDOR (B ANDOR C)) ANDOR D.  I've generalized the
    handling of the rightmost compare in the left-hand operand, going for
    the leftmost compare in the right-hand operand, and then onto trying
    to merge compares pairwise, one from each operand, even if they are
    not consecutive, taking care to avoid merging operations with
    intervening side effects, including volatile accesses.

    When it is the second of a non-consecutive pair of compares that
    first accesses a word, we may merge the first compare with part of
    the second compare that refers to the same word, keeping the compare
    of the remaining bits at the spot where the second compare used to
    be.

    Handling compares with non-constant fields was somewhat generalized,
    now handling non-adjacent fields.  When a field of one object crosses
    an alignment boundary but the other doesn't, we issue the same load
    in both compares; gimple optimizers will later turn it into a single
    load, without our having to handle SAVE_EXPRs at this point.

    The logic for issuing split loads and compares, and ordering them, is
    now shared between all cases of compares with constants and with
    another object.

    The -Wno-error for toplev.o on rs6000 is because of toplev.c's:

      if ((flag_sanitize & SANITIZE_ADDRESS)
          && !FRAME_GROWS_DOWNWARD)

    and rs6000.h's:

      #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 \
                                    || (flag_sanitize & SANITIZE_ADDRESS) != 0)

    The mutually exclusive conditions involving flag_sanitize are now
    noticed and reported by fold-const.c's:

      warning (0,
               "%<and%> of mutually exclusive equal-tests"
               " is always 0");

    This patch enables over 12k compare-merging opportunities that we
    used to miss in a GCC bootstrap.


    for  gcc/ChangeLog

            * fold-const.cc (prepare_xor): New.
            (decode_field_reference): Handle xor, shift, and narrowing
            conversions.
            (all_ones_mask_p): Remove.
            (compute_split_boundary_from_align): New.
            (build_split_load, reuse_split_load): New.
            (fold_truth_andor_1): Add recursion to combine pairs of
            non-neighboring compares.  Handle xor compared with zero.
            Handle fields straddling across alignment boundaries.
            Generalize handling of non-constant rhs.
            (fold_truth_andor): Leave sub-expression handling to the
            recursion above.
            * config/rs6000/t-rs6000 (toplev.o-warn): Disable errors.

    for  gcc/testsuite/ChangeLog

            * gcc.dg/field-merge-1.c: New.
            * gcc.dg/field-merge-2.c: New.
            * gcc.dg/field-merge-3.c: New.
            * gcc.dg/field-merge-4.c: New.
            * gcc.dg/field-merge-5.c: New.

Diff:
---
 gcc/config/rs6000/t-rs6000           | 4 +
 gcc/fold-const.cc                    | 818 ---
 gcc/testsuite/gcc.dg/field-merge-1.c | 64 +++
 gcc/testsuite/gcc.dg/field-merge-2.c | 31 ++
 gcc/testsuite/gcc.dg/field-merge-3.c | 36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c | 40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c | 40 ++
 7 files changed, 881 insertions(+), 152 deletions(-)

diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index 155788de40a3..a83968d663a6
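
The field-compare merge described in the commit message above can be sketched by hand in C. This is only an illustration of the equivalence the transformation relies on, not GCC's implementation or its output: the struct `pair`, the helper `load16`, the functions `eq_fields`, `eq_merged` and `eq_xor`, and the 16-bit `MASK` are all hypothetical names chosen for the example, and it assumes the two byte fields occupy exactly one 16-bit word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical object with two adjacent byte-sized fields.  */
struct pair { unsigned char x1, y1; };

/* The sketch assumes both fields fit in one 16-bit word with no padding.  */
_Static_assert (sizeof (struct pair) == 2, "pair must be exactly 2 bytes");

/* Mask covering the bits of both fields (an assumption of this sketch).  */
#define MASK 0xffffu

/* Unmerged form: (a.x1 == b.x1) && (a.y1 == b.y1).  */
int eq_fields (const struct pair *a, const struct pair *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Load the word holding both fields; memcpy avoids aliasing problems.  */
uint16_t load16 (const struct pair *p)
{
  uint16_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* Merged form: one masked word compare instead of two field compares.  */
int eq_merged (const struct pair *a, const struct pair *b)
{
  return (load16 (a) & MASK) == (load16 (b) & MASK);
}

/* Xor form: ((a ^ b) & MASK) == 0 tests the same bits for equality.  */
int eq_xor (const struct pair *a, const struct pair *b)
{
  return ((load16 (a) ^ load16 (b)) & MASK) == 0;
}
```

All three functions should agree for any pair of objects, since each one tests equality of the same 16 bits. The merged forms read both fields unconditionally, which is harmless here because plain loads of adjacent fields have no side effects; that is the kind of condition the commit message says must hold before compares are merged.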
[gcc/aoliva/heads/testme] (24 commits) relax ifcombine to accept vuses
The branch 'aoliva/heads/testme' was updated to point to:

 7b7dfff4b174... relax ifcombine to accept vuses

It previously pointed to:

 0d90ad11fb42... relax ifcombine to accept vuses

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  0d90ad1... relax ifcombine to accept vuses
  1494b67... fold truth-and only in ifcombine
  163a769... check for mergeable loads, choose insertion points accordin
  e4f6196... rework truth_andor folding into tree-ssa-ifcombine
  6621257... assorted improvements for fold_truth_andor_1
  90eb457... testsuite: a few more hostedlib adjustments

Summary of changes (added commits):
---

  7b7dfff... relax ifcombine to accept vuses
  e731ae8... fold truth-and only in ifcombine
  fbf1f80... check for mergeable loads, choose insertion points accordin
  b4b872b... rework truth_andor folding into tree-ssa-ifcombine
  8aa412b... assorted improvements for fold_truth_andor_1
  b56bd54... testsuite: a few more hostedlib adjustments (*)
  d53c5bc... Daily bump. (*)
  4ffca99... AVR: Detect more skip opportunities. (*)
  8ed8c34... Fix factor_out_conditional_operation heuristics for constan (*)
  b55f5e3... AVR: Use avr_byte instead of simplify_gen_subreg (QImode, . (*)
  4ee6923... c++: -fimplicit-constexpr diagnostic improvement [PR116696] (*)
  9998846... Fortran: Fixes to OpenMP 'interop' directive parsing suppor (*)
  508ef58... gcn/mkoffload.cc: Use #embed for including the generated EL (*)
  b7b6773... c++: Don't emit deprecated/unavailable attribute diagnostic (*)
  4963eb7... libcpp: Fix up UB in finish_embed (*)
  46c2538... s390: Fix TF to FPRX2 conversion [PR115860] (*)
  1a71ff3... s390: Fix AQ and AR constraints (*)
  5938e06... libstdc++: Do not use use memmove for 1-element ranges [PR1 (*)
  494d3c3... AVR: Rework avr_out_compare. (*)
  1ec1677... AVR: Tweak 32-bit EQ and NE comparisons. (*)
  be59aaf... AVR: avr.cc - Reorder functions to require less forward dec (*)
  45e7cc9... Match: Remove unnecessary types_match for case 1 of signed (*)
  5d9486c... Fix endianness issue on unsigned_21.f90. (*)
  3d021a0... Daily bump. (*)

(*) This commit already exists in another branch.
    Because the reference `refs/users/aoliva/heads/testme' matches
    your hooks.email-new-commits-only configuration,
    no separate email is sent for this commit.
[gcc/aoliva/heads/testbase] (19 commits) testsuite: a few more hostedlib adjustments
The branch 'aoliva/heads/testbase' was updated to point to:

 b56bd542942b... testsuite: a few more hostedlib adjustments

It previously pointed to:

 4308c343b8ea... testsuite: introduce hostedlib effective target

Diff:

Summary of changes (added commits):
---

  b56bd54... testsuite: a few more hostedlib adjustments (*)
  d53c5bc... Daily bump. (*)
  4ffca99... AVR: Detect more skip opportunities. (*)
  8ed8c34... Fix factor_out_conditional_operation heuristics for constan (*)
  b55f5e3... AVR: Use avr_byte instead of simplify_gen_subreg (QImode, . (*)
  4ee6923... c++: -fimplicit-constexpr diagnostic improvement [PR116696] (*)
  9998846... Fortran: Fixes to OpenMP 'interop' directive parsing suppor (*)
  508ef58... gcn/mkoffload.cc: Use #embed for including the generated EL (*)
  b7b6773... c++: Don't emit deprecated/unavailable attribute diagnostic (*)
  4963eb7... libcpp: Fix up UB in finish_embed (*)
  46c2538... s390: Fix TF to FPRX2 conversion [PR115860] (*)
  1a71ff3... s390: Fix AQ and AR constraints (*)
  5938e06... libstdc++: Do not use use memmove for 1-element ranges [PR1 (*)
  494d3c3... AVR: Rework avr_out_compare. (*)
  1ec1677... AVR: Tweak 32-bit EQ and NE comparisons. (*)
  be59aaf... AVR: avr.cc - Reorder functions to require less forward dec (*)
  45e7cc9... Match: Remove unnecessary types_match for case 1 of signed (*)
  5d9486c... Fix endianness issue on unsigned_21.f90. (*)
  3d021a0... Daily bump. (*)

(*) This commit already exists in another branch.
    Because the reference `refs/users/aoliva/heads/testbase' matches
    your hooks.email-new-commits-only configuration,
    no separate email is sent for this commit.
[gcc r15-3636] testsuite: a few more hostedlib adjustments
https://gcc.gnu.org/g:b56bd542942ba7bd2020d5824e57d819974bc071

commit r15-3636-gb56bd542942ba7bd2020d5824e57d819974bc071
Author: Alexandre Oliva
Date:   Fri Sep 13 21:42:41 2024 -0300

    testsuite: a few more hostedlib adjustments

    This adjusts some recently-added tests that won't compile without a
    hostedlib libstdc++, missed in the patch that just went in, and also
    an old test that I'd missed because it also failed in my baseline.

    for  gcc/testsuite/ChangeLog

            * g++.dg/coroutines/pr108620.C: Skip if !hostedlib because of
            unavailable headers.
            * g++.dg/other/profile1.C: Likewise.
            * g++.dg/ext/pragma-unroll-lambda-lto.C: Skip if !hostedlib
            because of unavailable declarations.

Diff:
---
 gcc/testsuite/g++.dg/coroutines/pr108620.C          | 2 ++
 gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C | 1 +
 gcc/testsuite/g++.dg/other/profile1.C               | 1 +
 3 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/g++.dg/coroutines/pr108620.C b/gcc/testsuite/g++.dg/coroutines/pr108620.C
index e8016b9f8a23..22bf0c18bac4 100644
--- a/gcc/testsuite/g++.dg/coroutines/pr108620.C
+++ b/gcc/testsuite/g++.dg/coroutines/pr108620.C
@@ -1,3 +1,5 @@
+// { dg-skip-if "requires hosted libstdc++ for iostream" { ! hostedlib } }
+
 // https://gcc.gnu.org/PR108620
 #include
 #include
diff --git a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
index 144c4c326924..64cdf90f34d3 100644
--- a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
+++ b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
@@ -1,5 +1,6 @@
 // { dg-do link { target c++11 } }
 // { dg-options "-O2 -flto -fdump-rtl-loop2_unroll" }
+// { dg-skip-if "requires hosted libstdc++ for cstdlib rand" { ! hostedlib } }
 
 #include
 
diff --git a/gcc/testsuite/g++.dg/other/profile1.C b/gcc/testsuite/g++.dg/other/profile1.C
index a4bf6b3d0fea..99844373189e 100644
--- a/gcc/testsuite/g++.dg/other/profile1.C
+++ b/gcc/testsuite/g++.dg/other/profile1.C
@@ -2,6 +2,7 @@
 // { dg-do run }
 // { dg-require-profiling "" }
 // { dg-options "-fnon-call-exceptions -fprofile-arcs" }
+// { dg-skip-if "requires hosted libstdc++ for string" { ! hostedlib } }
 
 #include
 
[gcc(refs/users/aoliva/heads/testme)] relax ifcombine to accept vuses
https://gcc.gnu.org/g:0d90ad11fb42573a4186c69117edc892f6fb9151

commit 0d90ad11fb42573a4186c69117edc892f6fb9151
Author: Alexandre Oliva
Date:   Fri Sep 13 02:13:50 2024 -0300

    relax ifcombine to accept vuses

Diff:
---
 gcc/config/i386/t-i386               |  2 ++
 gcc/testsuite/gcc.dg/field-merge-6.c | 25 +
 gcc/tree-ssa-ifcombine.cc            |  2 +-
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/t-i386 b/gcc/config/i386/t-i386
index bf4ae109af98..1b904787ec62 100644
--- a/gcc/config/i386/t-i386
+++ b/gcc/config/i386/t-i386
@@ -79,3 +79,5 @@ s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \
         $(AWK) -f $^ > tmp-bt.inc
         $(SHELL) $(srcdir)/../move-if-change tmp-bt.inc i386-builtin-types.inc
         $(STAMP) $@
+
+insn-attrtab.o-warn = -Wno-error
diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c
new file mode 100644
index ..4b09623f138a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-6.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+/* { dg-shouldfail } */
+
+struct s {
+  char a, b;
+  struct s *p;
+};
+
+struct s a = { 0, 1, 0 };
+struct s b = { 0, 0, &a };
+
+int f () {
+  /* Check that the third compare won't be pulled ahead of the second one and
+     prevent the NULL pointer dereference that should cause the execution to
+     fail.  */
+  return (a.a != b.a
+          || b.p->b != a.p->a
+          || a.b != b.b);
+}
+
+int main() {
+  f ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 61480e5fa894..7678c87e0170 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -129,7 +129,7 @@ bb_no_side_effects_p (basic_block bb)
       enum tree_code rhs_code;
       if (gimple_has_side_effects (stmt)
           || gimple_could_trap_p (stmt)
-          || gimple_vuse (stmt)
+          /* || gimple_vuse (stmt) */
           /* We need to rewrite stmts with undefined overflow to use
              unsigned arithmetic but cannot do so for signed division.  */
           || ((ass = dyn_cast <gassign *> (stmt))
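
The comment in field-merge-6.c says the third compare must not be pulled ahead of the second. A minimal sketch of why, assuming a hypothetical helper `middle_differs` that counts evaluations instead of faulting on the null pointer (the real test relies on the fault itself, and none of the function names below come from the patch):

```c
#include <assert.h>
#include <stddef.h>

struct s {
  char a, b;
  struct s *p;
};

/* Counts how often the middle compare runs.  The counter stands in for
   the trap so that this sketch can run to completion.  */
static int middle_evals;

/* Hypothetical stand-in for "b.p->b != a.p->a".  A null pointer is
   reported as "differs" instead of being dereferenced.  */
static int
middle_differs (const struct s *x, const struct s *y)
{
  middle_evals++;
  if (x->p == NULL || y->p == NULL)
    return 1;
  return y->p->b != x->p->a;
}

/* Source order: first compare, then the middle one, then the third.  */
static int
in_source_order (const struct s *x, const struct s *y)
{
  return x->a != y->a || middle_differs (x, y) || x->b != y->b;
}

/* What a too-eager merge would amount to: the third compare is joined
   with the first, ahead of the middle one.  */
static int
third_pulled_ahead (const struct s *x, const struct s *y)
{
  return (x->a != y->a || x->b != y->b) || middle_differs (x, y);
}
```

With the test's data (`a = { 0, 1, 0 }`, `b = { 0, 0, &a }`), the source-order version reaches the middle compare, while the reordered one short-circuits on `a.b != b.b` and never gets there. Both return the same value, but only the first would hit the dereference that dg-shouldfail expects.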
[gcc/aoliva/heads/testme] relax ifcombine to accept vuses
The branch 'aoliva/heads/testme' was updated to point to:

 0d90ad11fb42... relax ifcombine to accept vuses

It previously pointed to:

 cfc10ee04079... relax ifcombine to accept vuses

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  cfc10ee... relax ifcombine to accept vuses

Summary of changes (added commits):
---

  0d90ad1... relax ifcombine to accept vuses
[gcc(refs/users/aoliva/heads/testme)] relax ifcombine to accept vuses
https://gcc.gnu.org/g:cfc10ee040798637c72487435d0ab8668e05c386

commit cfc10ee040798637c72487435d0ab8668e05c386
Author: Alexandre Oliva
Date:   Fri Sep 13 02:13:50 2024 -0300

    relax ifcombine to accept vuses

Diff:
---
 gcc/config/i386/t-i386               |  2 ++
 gcc/testsuite/gcc.dg/field-merge-6.c | 25 +
 gcc/tree-ssa-ifcombine.cc            |  2 +-
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/t-i386 b/gcc/config/i386/t-i386
index bf4ae109af98..1b904787ec62 100644
--- a/gcc/config/i386/t-i386
+++ b/gcc/config/i386/t-i386
@@ -79,3 +79,5 @@ s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \
         $(AWK) -f $^ > tmp-bt.inc
         $(SHELL) $(srcdir)/../move-if-change tmp-bt.inc i386-builtin-types.inc
         $(STAMP) $@
+
+insn-attrtab.o-warn = -Wno-error
diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c
new file mode 100644
index ..c42bed927c66
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-6.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+/* { dg-shouldfail } */
+
+struct s {
+  char a, b;
+  struct s *p;
+};
+
+struct s a = { 0, 1, 0 };
+struct s b = { 0, 0, &a };
+
+int f () {
+  /* Check that the third compare won't be pulled ahead of the second one and
+     prevent the NULL pointer dereference that should cause the execution to
+     fail.  */
+  return (a->a != b->a
+          || b->p->b != a->p->a
+          || a->b != b->b);
+}
+
+int main() {
+  f ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 61480e5fa894..7678c87e0170 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -129,7 +129,7 @@ bb_no_side_effects_p (basic_block bb)
       enum tree_code rhs_code;
       if (gimple_has_side_effects (stmt)
           || gimple_could_trap_p (stmt)
-          || gimple_vuse (stmt)
+          /* || gimple_vuse (stmt) */
           /* We need to rewrite stmts with undefined overflow to use
              unsigned arithmetic but cannot do so for signed division.  */
           || ((ass = dyn_cast <gassign *> (stmt))
[gcc/aoliva/heads/testme] relax ifcombine to accept vuses
The branch 'aoliva/heads/testme' was updated to point to:

 cfc10ee04079... relax ifcombine to accept vuses

It previously pointed to:

 70860fa4cab8... relax ifcombine to accept vuses

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  70860fa... relax ifcombine to accept vuses

Summary of changes (added commits):
---

  cfc10ee... relax ifcombine to accept vuses
[gcc(refs/users/aoliva/heads/testme)] relax ifcombine to accept vuses
https://gcc.gnu.org/g:70860fa4cab8ad90645027582ac5775716495819

commit 70860fa4cab8ad90645027582ac5775716495819
Author: Alexandre Oliva
Date:   Fri Sep 13 02:13:50 2024 -0300

    relax ifcombine to accept vuses

Diff:
---
 gcc/config/i386/t-i386    | 2 ++
 gcc/tree-ssa-ifcombine.cc | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/t-i386 b/gcc/config/i386/t-i386
index bf4ae109af98..1b904787ec62 100644
--- a/gcc/config/i386/t-i386
+++ b/gcc/config/i386/t-i386
@@ -79,3 +79,5 @@ s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \
         $(AWK) -f $^ > tmp-bt.inc
         $(SHELL) $(srcdir)/../move-if-change tmp-bt.inc i386-builtin-types.inc
         $(STAMP) $@
+
+insn-attrtab.o-warn = -Wno-error
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 61480e5fa894..7678c87e0170 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -129,7 +129,7 @@ bb_no_side_effects_p (basic_block bb)
       enum tree_code rhs_code;
       if (gimple_has_side_effects (stmt)
           || gimple_could_trap_p (stmt)
-          || gimple_vuse (stmt)
+          /* || gimple_vuse (stmt) */
           /* We need to rewrite stmts with undefined overflow to use
              unsigned arithmetic but cannot do so for signed division.  */
           || ((ass = dyn_cast <gassign *> (stmt))
[gcc/aoliva/heads/testme] relax ifcombine to accept vuses
The branch 'aoliva/heads/testme' was updated to point to:

 70860fa4cab8... relax ifcombine to accept vuses

It previously pointed to:

 d82778e46cec... relax ifcombine to accept vuses

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  d82778e... relax ifcombine to accept vuses

Summary of changes (added commits):
---

  70860fa... relax ifcombine to accept vuses
[gcc(refs/users/aoliva/heads/testme)] relax ifcombine to accept vuses
https://gcc.gnu.org/g:d82778e46cecd7879b647471b126fa156e6672f2

commit d82778e46cecd7879b647471b126fa156e6672f2
Author: Alexandre Oliva
Date:   Fri Sep 13 02:13:50 2024 -0300

    relax ifcombine to accept vuses

Diff:
---
 gcc/tree-ssa-ifcombine.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 61480e5fa894..7678c87e0170 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -129,7 +129,7 @@ bb_no_side_effects_p (basic_block bb)
       enum tree_code rhs_code;
       if (gimple_has_side_effects (stmt)
           || gimple_could_trap_p (stmt)
-          || gimple_vuse (stmt)
+          /* || gimple_vuse (stmt) */
           /* We need to rewrite stmts with undefined overflow to use
              unsigned arithmetic but cannot do so for signed division.  */
           || ((ass = dyn_cast <gassign *> (stmt))
[gcc(refs/users/aoliva/heads/testme)] fold truth-and only in ifcombine
https://gcc.gnu.org/g:1494b67efaa0c4c3ebd46b7fcaee5a3389124d4b commit 1494b67efaa0c4c3ebd46b7fcaee5a3389124d4b Author: Alexandre Oliva Date: Fri Aug 18 00:51:23 2023 -0300 fold truth-and only in ifcombine Diff: --- gcc/gimple-fold.cc| 2 ++ gcc/tree-ssa-ifcombine.cc | 24 +--- 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 85a0ec028030..5b7d83edbea9 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -8738,12 +8738,14 @@ maybe_fold_and_comparisons (tree type, op2b, outer_cond_bb)) return t; +#if 0 if (tree t = fold_truth_andor_maybe_separate (UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, type, code2, op2a, op2b, code1, op1a, op1b, NULL)) return t; +#endif return NULL_TREE; } diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 79a4bdd363b9..61480e5fa894 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -399,6 +399,14 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, outer2->probability = profile_probability::never (); } +/* FIXME: move to a header file. */ +extern tree +fold_truth_andor_maybe_separate (location_t loc, +enum tree_code code, tree truth_type, +enum tree_code lcode, tree ll_arg, tree lr_arg, +enum tree_code rcode, tree rl_arg, tree rr_arg, +tree *separatep); + /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. 
inner_inv, outer_inv and result_inv indicate whether the conditions @@ -576,7 +584,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, else if (TREE_CODE_CLASS (gimple_cond_code (inner_cond)) == tcc_comparison && TREE_CODE_CLASS (gimple_cond_code (outer_cond)) == tcc_comparison) { - tree t; + tree t, ts = NULL_TREE; enum tree_code inner_cond_code = gimple_cond_code (inner_cond); enum tree_code outer_cond_code = gimple_cond_code (outer_cond); @@ -599,7 +607,17 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, outer_cond_code, gimple_cond_lhs (outer_cond), gimple_cond_rhs (outer_cond), - gimple_bb (outer_cond + gimple_bb (outer_cond))) + && !(t = ts = (fold_truth_andor_maybe_separate +(UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, + boolean_type_node, + outer_cond_code, + gimple_cond_lhs (outer_cond), + gimple_cond_rhs (outer_cond), + inner_cond_code, + gimple_cond_lhs (inner_cond), + gimple_cond_rhs (inner_cond), + NULL { { tree t1, t2; @@ -636,7 +654,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, NULL, true, GSI_SAME_STMT); } /* ??? Fold should avoid this. */ - else if (!is_gimple_condexpr_for_cond (t)) + else if (ts && !is_gimple_condexpr_for_cond (t)) goto gimplify_after_fold; if (result_inv) t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t);
[gcc(refs/users/aoliva/heads/testme)] check for mergeable loads, choose insertion points accordingly
https://gcc.gnu.org/g:163a7691962e2a60402d2b75fb2243bfd33b3595 commit 163a7691962e2a60402d2b75fb2243bfd33b3595 Author: Alexandre Oliva Date: Thu Jul 27 05:15:20 2023 -0300 check for mergeable loads, choose insertion points accordingly Diff: --- gcc/gimple-fold.cc | 253 ++--- 1 file changed, 219 insertions(+), 34 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 64426bd76977..85a0ec028030 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see #include "varasm.h" #include "internal-fn.h" #include "gimple-range.h" +#include "tree-ssa-loop-niter.h" // stmt_dominates_stmt_p /* ??? Move this to some header, it's defined in fold-const.c. */ extern tree @@ -7395,7 +7396,7 @@ maybe_fold_comparisons_from_match_pd (tree type, enum tree_code code, Same as ssa_is_replaceable_p, except that we don't insist it has a single use. */ -bool +static bool ssa_is_substitutable_p (gimple *stmt) { #if 0 @@ -7476,9 +7477,10 @@ is_cast_p (tree *name) if (gimple_num_ops (def) != 2) break; - if (get_gimple_rhs_class (gimple_expr_code (def)) - == GIMPLE_SINGLE_RHS) + if (gimple_assign_single_p (def)) { + if (gimple_assign_load_p (def)) + break; *name = gimple_assign_rhs1 (def); continue; } @@ -7515,8 +7517,7 @@ is_binop_p (enum tree_code code, tree *name) return 0; case 2: - if (get_gimple_rhs_class (gimple_expr_code (def)) - == GIMPLE_SINGLE_RHS) + if (gimple_assign_single_p (def) && !gimple_assign_load_p (def)) { *name = gimple_assign_rhs1 (def); continue; @@ -7524,7 +7525,7 @@ is_binop_p (enum tree_code code, tree *name) return 0; case 3: - ; + break; } if (gimple_assign_rhs_code (def) != code) @@ -7569,6 +7570,26 @@ prepare_xor (tree l_arg, tree *r_arg) return ret; } +/* If EXP is a SSA_NAME whose DEF is a load stmt, set *LOAD to it and + return its RHS, otherwise return EXP. 
*/ + +static tree +follow_load (tree exp, gimple **load) +{ + if (TREE_CODE (exp) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (exp)) +{ + gimple *def = SSA_NAME_DEF_STMT (exp); + if (gimple_assign_load_p (def)) + { + *load = def; + exp = gimple_assign_rhs1 (def); + } +} + + return exp; +} + /* Subroutine for fold_truth_andor_1: decode a field reference. If EXP is a comparison reference, we return the innermost reference. @@ -7595,6 +7616,9 @@ prepare_xor (tree l_arg, tree *r_arg) BIT_XOR_EXPR compared with zero. We're to take the first or second operand thereof if so. It should be zero otherwise. + *LOAD is set to the load stmt of the innermost reference, if any, + *and NULL otherwise. + Return 0 if this is not a component reference or is one that we can't do anything with. */ @@ -7602,7 +7626,8 @@ static tree decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, HOST_WIDE_INT *pbitpos, machine_mode *pmode, int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask, int xor_which) + tree *pmask, tree *pand_mask, int xor_which, + gimple **load) { tree exp = *exp_; tree outer_type = 0; @@ -7612,11 +7637,13 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, unsigned int precision; HOST_WIDE_INT shiftrt = 0; + *load = NULL; + /* All the optimizations using this function assume integer fields. There are problems with FP fields since the type_for_size call below can fail for, e.g., XFmode. */ if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) -return 0; +return NULL_TREE; /* We are interested in the bare arrangement of bits, so strip everything that doesn't affect the machine mode. 
However, record the type of the @@ -7626,7 +7653,7 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, if ((and_mask = is_binop_p (BIT_AND_EXPR, &exp))) { if (TREE_CODE (and_mask) != INTEGER_CST) - return 0; + return NULL_TREE; } if (xor_which) @@ -7644,16 +7671,18 @@ decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, if (tree shift = is_binop_p (RSHIFT_EXPR, &exp)) { if (TREE_CODE (shift) != INTEGER_CST || !tree_fits_shwi_p (shift)) - return 0; + return NULL_TREE; shiftrt = tree_to_shwi (shift); if (shiftrt <= 0) - return 0; + return NULL_TREE; } if (tree t = is_cast_p (&exp)) if (!outer_type) outer_type = t; + exp = follow_load (exp, load); + poly_int64 poly_bitsize, poly_bitpos; inner = ge
[gcc(refs/users/aoliva/heads/testme)] rework truth_andor folding into tree-ssa-ifcombine
https://gcc.gnu.org/g:e4f6196e7a16de0ceb4b9f4b68993a1f8454a7fc commit e4f6196e7a16de0ceb4b9f4b68993a1f8454a7fc Author: Alexandre Oliva Date: Tue Sep 29 12:55:20 2020 -0300 rework truth_andor folding into tree-ssa-ifcombine Diff: --- gcc/fold-const.cc | 1048 + gcc/gimple-fold.cc| 1149 + gcc/tree-ssa-ifcombine.cc |7 +- 3 files changed, 1170 insertions(+), 1034 deletions(-) diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 81814de5b04b..19824e6a477f 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -137,7 +137,6 @@ static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code, tree, tree, tree, tree); -static tree unextend (tree, int, int, tree); static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *); static tree fold_binary_op_with_conditional_arg (location_t, @@ -4695,7 +4694,7 @@ invert_truthvalue_loc (location_t loc, tree arg) is the original memory reference used to preserve the alias set of the access. */ -static tree +tree make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type, HOST_WIDE_INT bitsize, poly_int64 bitpos, int unsignedp, int reversep) @@ -4945,212 +4944,6 @@ optimize_bit_field_compare (location_t loc, enum tree_code code, return lhs; } -/* If *R_ARG is a constant zero, and L_ARG is a possibly masked - BIT_XOR_EXPR, return 1 and set *r_arg to l_arg. - Otherwise, return 0. - - The returned value should be passed to decode_field_reference for it - to handle l_arg, and then doubled for r_arg. 
*/ -static int -prepare_xor (tree l_arg, tree *r_arg) -{ - int ret = 0; - - if (!integer_zerop (*r_arg)) -return ret; - - tree exp = l_arg; - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) -{ - tree and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return ret; -} - - if (TREE_CODE (exp) == BIT_XOR_EXPR) -{ - *r_arg = l_arg; - return 1; -} - - return ret; -} - -/* Subroutine for fold_truth_andor_1: decode a field reference. - - If EXP is a comparison reference, we return the innermost reference. - - *PBITSIZE is set to the number of bits in the reference, *PBITPOS is - set to the starting bit number. - - If the innermost field can be completely contained in a mode-sized - unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. - - *PVOLATILEP is set to 1 if the any expression encountered is volatile; - otherwise it is not changed. - - *PUNSIGNEDP is set to the signedness of the field. - - *PREVERSEP is set to the storage order of the field. - - *PMASK is set to the mask used. This is either contained in a - BIT_AND_EXPR or derived from the width of the field. - - *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. - - XOR_WHICH is 1 or 2 if EXP was found to be a (possibly masked) - BIT_XOR_EXPR compared with zero. We're to take the first or second - operand thereof if so. It should be zero otherwise. - - Return 0 if this is not a component reference or is one that we can't - do anything with. 
*/ - -static tree -decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, - HOST_WIDE_INT *pbitpos, machine_mode *pmode, - int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask, int xor_which) -{ - tree exp = *exp_; - tree outer_type = 0; - tree and_mask = 0; - tree mask, inner, offset; - tree unsigned_type; - unsigned int precision; - HOST_WIDE_INT shiftrt = 0; - - /* All the optimizations using this function assume integer fields. - There are problems with FP fields since the type_for_size call - below can fail for, e.g., XFmode. */ - if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) -return NULL_TREE; - - /* We are interested in the bare arrangement of bits, so strip everything - that doesn't affect the machine mode. However, record the type of the - outermost expression if it may matter below. */ - if (CONVERT_EXPR_P (exp) - || TREE_CODE (exp) == NON_LVALUE_EXPR) -outer_type = TREE_TYPE (exp); - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) -{ - and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return NULL_TREE; -} - - if (xor_which) -{ - gcc_checking_assert (TREE_CODE (exp) == BIT_XOR_EXPR); -
[gcc(refs/users/aoliva/heads/testme)] assorted improvements for fold_truth_andor_1
https://gcc.gnu.org/g:66212571a719e4cfcedb59103eb8fd7dc52e84f8

commit 66212571a719e4cfcedb59103eb8fd7dc52e84f8
Author: Alexandre Oliva
Date:   Tue Sep 29 04:08:46 2020 -0300

    assorted improvements for fold_truth_andor_1

    This patch introduces various improvements to the logic that merges
    field compares.

    Before the patch, we could merge:

      (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

    into something like:

      (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

    if both of A's fields live within the same alignment boundaries, and
    so do B's, at the same relative positions.  Constants may be used
    instead of the object B.

    The initial goal of this patch was to enable such combinations when a
    field crossed alignment boundaries, e.g. for packed types.  We can't
    generally access such fields with a single memory access, so when we
    come across such a compare, we will attempt to combine each access
    separately.

    Some merging opportunities were missed because of right-shifts,
    compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
    narrowing conversions, especially after earlier merges.  This patch
    introduces handlers for several cases involving these.

    Other merging opportunities were missed because of association.  The
    existing logic would only succeed in merging a pair of consecutive
    compares, or e.g. B with C in (A ANDOR B) ANDOR C, not even trying
    e.g. C and D in (A ANDOR (B ANDOR C)) ANDOR D.  I've generalized the
    handling of the rightmost compare in the left-hand operand, going for
    the leftmost compare in the right-hand operand, and then onto trying
    to merge compares pairwise, one from each operand, even if they are
    not consecutive, taking care to avoid merging operations with
    intervening side effects, including volatile accesses.

    When it is the second of a non-consecutive pair of compares that
    first accesses a word, we may merge the first compare with part of
    the second compare that refers to the same word, keeping the compare
    of the remaining bits at the spot where the second compare used to
    be.

    Handling compares with non-constant fields was somewhat generalized,
    now handling non-adjacent fields.  When a field of one object crosses
    an alignment boundary but the other doesn't, we issue the same load
    in both compares; gimple optimizers will later turn it into a single
    load, without our having to handle SAVE_EXPRs at this point.

    The logic for issuing split loads and compares, and ordering them, is
    now shared between all cases of compares with constants and with
    another object.

    The -Wno-error for toplev.o on rs6000 is because of toplev.c's:

      if ((flag_sanitize & SANITIZE_ADDRESS)
          && !FRAME_GROWS_DOWNWARD)

    and rs6000.h's:

      #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 \
                                    || (flag_sanitize & SANITIZE_ADDRESS) != 0)

    The mutually exclusive conditions involving flag_sanitize are now
    noticed and reported by fold-const.c's:

      warning (0,
               "%<and%> of mutually exclusive equal-tests"
               " is always 0");

    This patch enables over 12k compare-merging opportunities that we
    used to miss in a GCC bootstrap.

    for  gcc/ChangeLog

            * fold-const.cc (prepare_xor): New.
            (decode_field_reference): Handle xor, shift, and narrowing
            conversions.
            (all_ones_mask_p): Remove.
            (compute_split_boundary_from_align): New.
            (build_split_load, reuse_split_load): New.
            (fold_truth_andor_1): Add recursion to combine pairs of
            non-neighboring compares.  Handle xor compared with zero.
            Handle fields straddling across alignment boundaries.
            Generalize handling of non-constant rhs.
            (fold_truth_andor): Leave sub-expression handling to the
            recursion above.
            * config/rs6000/t-rs6000 (toplev.o-warn): Disable errors.

    for  gcc/testsuite/ChangeLog

            * gcc.dg/field-merge-1.c: New.
            * gcc.dg/field-merge-2.c: New.
            * gcc.dg/field-merge-3.c: New.
            * gcc.dg/field-merge-4.c: New.
            * gcc.dg/field-merge-5.c: New.

Diff:
---
 gcc/config/rs6000/t-rs6000           |   4 +
 gcc/fold-const.cc                    | 818 ---
 gcc/testsuite/gcc.dg/field-merge-1.c |  64 +++
 gcc/testsuite/gcc.dg/field-merge-2.c |  31 ++
 gcc/testsuite/gcc.dg/field-merge-3.c |  36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c |  40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c |  40 ++
 7 files changed, 881 insertions(+), 152 deletions(-)

diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index 155788de40a3..a83968d663a6
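
The basic merge described in the commit message, two field compares folded into one wider masked compare, can be pictured at source level. This is only a hand-written sketch under stated assumptions: `struct pair`, `eq_fieldwise` and `eq_merged` are hypothetical names, the two fields are adjacent bytes that fill one 16-bit word (so MASK is all ones), and `memcpy` stands in for the wider load the compiler would emit itself:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two adjacent one-byte fields sharing a 16-bit word.  */
struct pair {
  unsigned char x1, y1;
};

_Static_assert (sizeof (struct pair) == sizeof (uint16_t),
                "sketch assumes the two fields fill exactly one 16-bit word");

/* The form as written in the source: (a.x1 == b.x1) && (a.y1 == b.y1).  */
static int
eq_fieldwise (const struct pair *a, const struct pair *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* The merged form: one wider load per object and a single compare.
   Every bit of the word belongs to a compared field, so no mask is
   needed here.  */
static int
eq_merged (const struct pair *a, const struct pair *b)
{
  uint16_t wa, wb;
  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  return wa == wb;
}
```

Both functions return the same result for any pair of objects. The case the patch targets, a field that straddles an alignment boundary, would need two such loads and is not shown.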
[gcc(refs/users/aoliva/heads/testme)] testsuite: a few more hostedlib adjustments
https://gcc.gnu.org/g:90eb457e05f73d5bd81beb3419b5803bb68491b1

commit 90eb457e05f73d5bd81beb3419b5803bb68491b1
Author: Alexandre Oliva
Date:   Thu Sep 12 20:03:53 2024 -0300

    testsuite: a few more hostedlib adjustments

    This adjusts some recently-added tests that won't compile without a
    hostedlib libstdc++, missed in the patch that just went in, and also
    an old test that I'd missed because it also failed in my baseline.

    for  gcc/testsuite/ChangeLog

            * g++.dg/coroutines/pr108620.C: Skip if !hostedlib because of
            unavailable headers.
            * g++.dg/other/profile1.C: Likewise.
            * g++.dg/ext/pragma-unroll-lambda-lto.C: Skip if !hostedlib
            because of unavailable declarations.

Diff:
---
 gcc/testsuite/g++.dg/coroutines/pr108620.C          | 2 ++
 gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C | 1 +
 gcc/testsuite/g++.dg/other/profile1.C               | 1 +
 3 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/g++.dg/coroutines/pr108620.C b/gcc/testsuite/g++.dg/coroutines/pr108620.C
index e8016b9f8a23..22bf0c18bac4 100644
--- a/gcc/testsuite/g++.dg/coroutines/pr108620.C
+++ b/gcc/testsuite/g++.dg/coroutines/pr108620.C
@@ -1,3 +1,5 @@
+// { dg-skip-if "requires hosted libstdc++ for iostream" { ! hostedlib } }
+
 // https://gcc.gnu.org/PR108620
 #include
 #include
diff --git a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
index 144c4c326924..64cdf90f34d3 100644
--- a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
+++ b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
@@ -1,5 +1,6 @@
 // { dg-do link { target c++11 } }
 // { dg-options "-O2 -flto -fdump-rtl-loop2_unroll" }
+// { dg-skip-if "requires hosted libstdc++ for cstdlib rand" { ! hostedlib } }
 
 #include
 
diff --git a/gcc/testsuite/g++.dg/other/profile1.C b/gcc/testsuite/g++.dg/other/profile1.C
index a4bf6b3d0fea..99844373189e 100644
--- a/gcc/testsuite/g++.dg/other/profile1.C
+++ b/gcc/testsuite/g++.dg/other/profile1.C
@@ -2,6 +2,7 @@
 // { dg-do run }
 // { dg-require-profiling "" }
 // { dg-options "-fnon-call-exceptions -fprofile-arcs" }
+// { dg-skip-if "requires hosted libstdc++ for string" { ! hostedlib } }
 
 #include
 
[gcc/aoliva/heads/testme] (2 commits) assorted improvements for fold_truth_andor_1
The branch 'aoliva/heads/testme' was updated to point to:

 66212571a719... assorted improvements for fold_truth_andor_1

It previously pointed to:

 6af7fe931d94... testsuite: a few more hostedlib adjustments

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  6af7fe9... testsuite: a few more hostedlib adjustments
  5c5b83b... assorted improvements for fold_truth_andor_1

Summary of changes (added commits):
---

  6621257... assorted improvements for fold_truth_andor_1
  90eb457... testsuite: a few more hostedlib adjustments
[gcc(refs/users/aoliva/heads/testme)] testsuite: a few more hostedlib adjustments
https://gcc.gnu.org/g:6af7fe931d949dbf453fed41d2198abe2abd766c

commit 6af7fe931d949dbf453fed41d2198abe2abd766c
Author: Alexandre Oliva
Date:   Thu Sep 12 20:03:53 2024 -0300

    testsuite: a few more hostedlib adjustments

    This adjusts some recently-added tests that won't compile without a
    hostedlib libstdc++, missed in the patch that just went in, and also
    an old test that I'd missed because it also failed in my baseline.

    for  gcc/testsuite/ChangeLog

            * g++.dg/coroutines/pr108620.C: Skip if !hostedlib because of
            unavailable headers.
            * g++.dg/other/profile1.C: Likewise.
            * g++.dg/ext/pragma-unroll-lambda-lto.C: Skip if !hostedlib
            because of unavailable declarations.

Diff:
---
 gcc/testsuite/g++.dg/coroutines/pr108620.C          | 2 ++
 gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C | 1 +
 gcc/testsuite/g++.dg/other/profile1.C               | 1 +
 3 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/g++.dg/coroutines/pr108620.C b/gcc/testsuite/g++.dg/coroutines/pr108620.C
index e8016b9f8a23..22bf0c18bac4 100644
--- a/gcc/testsuite/g++.dg/coroutines/pr108620.C
+++ b/gcc/testsuite/g++.dg/coroutines/pr108620.C
@@ -1,3 +1,5 @@
+// { dg-skip-if "requires hosted libstdc++ for iostream" { ! hostedlib } }
+
 // https://gcc.gnu.org/PR108620
 #include
 #include
diff --git a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
index 144c4c326924..64cdf90f34d3 100644
--- a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
+++ b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
@@ -1,5 +1,6 @@
 // { dg-do link { target c++11 } }
 // { dg-options "-O2 -flto -fdump-rtl-loop2_unroll" }
+// { dg-skip-if "requires hosted libstdc++ for cstdlib rand" { ! hostedlib } }
 
 #include
 
diff --git a/gcc/testsuite/g++.dg/other/profile1.C b/gcc/testsuite/g++.dg/other/profile1.C
index a4bf6b3d0fea..99844373189e 100644
--- a/gcc/testsuite/g++.dg/other/profile1.C
+++ b/gcc/testsuite/g++.dg/other/profile1.C
@@ -2,6 +2,7 @@
 // { dg-do run }
 // { dg-require-profiling "" }
 // { dg-options "-fnon-call-exceptions -fprofile-arcs" }
+// { dg-skip-if "requires hosted libstdc++ for string" { ! hostedlib } }
 
 #include
 
[gcc(refs/users/aoliva/heads/testme)] assorted improvements for fold_truth_andor_1
https://gcc.gnu.org/g:5c5b83b5ad497638c9bf250d2ea33d67d758cc77

commit 5c5b83b5ad497638c9bf250d2ea33d67d758cc77
Author: Alexandre Oliva
Date: Tue Sep 29 04:08:46 2020 -0300

assorted improvements for fold_truth_andor_1

This patch introduces various improvements to the logic that merges field compares.

Before the patch, we could merge:

  (a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and so do B's, at the same relative positions. Constants may be used instead of the object B.

The initial goal of this patch was to enable such combinations when a field crossed alignment boundaries, e.g. for packed types. We can't generally access such fields with a single memory access, so when we come across such a compare, we will attempt to combine each access separately.

Some merging opportunities were missed because of right-shifts, compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and narrowing conversions, especially after earlier merges. This patch introduces handlers for several cases involving these.

Other merging opportunities were missed because of association. The existing logic would only succeed in merging a pair of consecutive compares, or e.g. B with C in (A ANDOR B) ANDOR C, not even trying e.g. C and D in (A ANDOR (B ANDOR C)) ANDOR D. I've generalized the handling of the rightmost compare in the left-hand operand, going for the leftmost compare in the right-hand operand, and then onto trying to merge compares pairwise, one from each operand, even if they are not consecutive, taking care to avoid merging operations with intervening side effects, including volatile accesses.
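
The merge described so far can be illustrated at the source level. The following is only a sketch, not code from the patch: the struct layout, the mask bytes, and the helper names (rec, load_word, eq_fields, eq_merged, eq_xor, forms_agree) are all hypothetical, and the word loads are written with memcpy so that the example is well-defined C on either endianness. It shows the two-compare form, the single masked-word compare, and the equivalent xor-against-zero form giving the same answer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: x1 and y1 live in the same 32-bit word.  */
struct rec {
  unsigned char x1;
  unsigned char pad;
  unsigned char y1;
  unsigned char z;
};

/* Bytes of the word that belong to x1 and y1 (in memory order).  */
static const unsigned char mask_bytes[4] = { 0xff, 0x00, 0xff, 0x00 };

/* Load one 32-bit word without aliasing or alignment assumptions.  */
static uint32_t load_word(const void *p)
{
  uint32_t w;
  memcpy(&w, p, sizeof w);
  return w;
}

/* Source form: (a.x1 == b.x1) && (a.y1 == b.y1).  */
static int eq_fields(const struct rec *a, const struct rec *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Merged form: (word(a) & MASK) == (word(b) & MASK).  */
static int eq_merged(const struct rec *a, const struct rec *b)
{
  uint32_t mask = load_word(mask_bytes);
  return (load_word(a) & mask) == (load_word(b) & mask);
}

/* Xor form: ((word(a) ^ word(b)) & MASK) == 0.  */
static int eq_xor(const struct rec *a, const struct rec *b)
{
  uint32_t mask = load_word(mask_bytes);
  return ((load_word(a) ^ load_word(b)) & mask) == 0;
}

/* Returns 1 if all three forms agree on a small grid of field values.  */
static int forms_agree(void)
{
  assert(sizeof(struct rec) == sizeof(uint32_t));
  for (unsigned i = 0; i < 256; i += 51)
    for (unsigned j = 0; j < 256; j += 85)
      {
        /* pad and z differ on purpose: the mask must ignore them.  */
        struct rec a = { (unsigned char) i, 1, (unsigned char) j, 2 };
        struct rec b = { (unsigned char) j, 3, (unsigned char) i, 4 };
        int expect = eq_fields(&a, &b);
        if (eq_merged(&a, &b) != expect || eq_xor(&a, &b) != expect)
          return 0;
      }
  return 1;
}
```

Calling forms_agree() from a main function exercises both equal and unequal field pairs. The unrelated pad and z bytes never affect the result, which is the property the MASK in the commit message provides.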

When it is the second of a non-consecutive pair of compares that first accesses a word, we may merge the first compare with part of the second compare that refers to the same word, keeping the compare of the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized, now handling non-adjacent fields. When a field of one object crosses an alignment boundary but the other doesn't, we issue the same load in both compares; gimple optimizers will later turn it into a single load, without our having to handle SAVE_EXPRs at this point.

The logic for issuing split loads and compares, and ordering them, is now shared between all cases of compares with constants and with another object.

The -Wno-error for toplev.o on rs6000 is because of toplev.c's:

  if ((flag_sanitize & SANITIZE_ADDRESS) && !FRAME_GROWS_DOWNWARD)

and rs6000.h's:

  #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 \
  || (flag_sanitize & SANITIZE_ADDRESS) != 0)

The mutually exclusive conditions involving flag_sanitize are now noticed and reported by fold-const.c's:

  warning (0,
  "%<and%> of mutually exclusive equal-tests"
  " is always 0");

This patch enables over 12k compare-merging opportunities that we used to miss in a GCC bootstrap.

for gcc/ChangeLog

* fold-const.cc (prepare_xor): New.
(decode_field_reference): Handle xor, shift, and narrowing conversions.
(all_ones_mask_p): Remove.
(compute_split_boundary_from_align): New.
(build_split_load, reuse_split_load): New.
(fold_truth_andor_1): Add recursion to combine pairs of non-neighboring compares. Handle xor compared with zero. Handle fields straddling across alignment boundaries. Generalize handling of non-constant rhs.
(fold_truth_andor): Leave sub-expression handling to the recursion above.
* config/rs6000/t-rs6000 (toplev.o-warn): Disable errors.

for gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.

Diff:
---
 gcc/config/rs6000/t-rs6000 | 4 +
 gcc/fold-const.cc | 818 ---
 gcc/testsuite/gcc.dg/field-merge-1.c | 64 +++
 gcc/testsuite/gcc.dg/field-merge-2.c | 31 ++
 gcc/testsuite/gcc.dg/field-merge-3.c | 36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c | 40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c | 40 ++
 7 files changed, 881 insertions(+), 152 deletions(-)

diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index 155788de40a3..a83968d663a6
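
The rs6000 note in the commit message above rests on a small piece of boolean reasoning: once FRAME_GROWS_DOWNWARD is expanded, the toplev.c condition tests the same flag_sanitize bit for being both set and clear. A minimal sketch of that reasoning follows. The bit value and the function names (SANITIZE_ADDRESS_BIT, frame_grows_downward, toplev_condition, count_true_cases) are stand-ins chosen for illustration, not GCC's actual definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the SANITIZE_ADDRESS bit of flag_sanitize.  */
#define SANITIZE_ADDRESS_BIT 1u

/* Mirrors the shape of the rs6000.h macro quoted in the commit message.  */
static int frame_grows_downward(unsigned flag_stack_protect,
                                unsigned flag_sanitize)
{
  return flag_stack_protect != 0
         || (flag_sanitize & SANITIZE_ADDRESS_BIT) != 0;
}

/* Mirrors the shape of the toplev.c condition quoted above.  */
static int toplev_condition(unsigned flag_stack_protect,
                            unsigned flag_sanitize)
{
  return (flag_sanitize & SANITIZE_ADDRESS_BIT)
         && !frame_grows_downward(flag_stack_protect, flag_sanitize);
}

/* Counts the flag combinations for which the condition holds.  */
static int count_true_cases(void)
{
  int count = 0;
  for (unsigned sp = 0; sp < 3; sp++)
    for (unsigned fs = 0; fs < 4; fs++)
      if (toplev_condition(sp, fs))
        count++;
  return count;
}
```

The left operand requires the bit to be set, while the negated macro requires it to be clear, so no combination of flag values satisfies both and count_true_cases() finds none. That is the kind of always-false conjunction the quoted warning reports.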
[gcc/aoliva/heads/testme] testsuite: introduce hostedlib effective target
The branch 'aoliva/heads/testme' was updated to point to:
  38cc6007421e... testsuite: introduce hostedlib effective target
It previously pointed to:
  fb5b60dab71e... testsuite: introduce hostedlib effective target
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  fb5b60d... testsuite: introduce hostedlib effective target
Summary of changes (added commits):
---
  38cc600... testsuite: introduce hostedlib effective target
[gcc/aoliva/heads/testme] testsuite: introduce hostedlib effective target
The branch 'aoliva/heads/testme' was updated to point to:
  fb5b60dab71e... testsuite: introduce hostedlib effective target
It previously pointed to:
  f65e763abf41... testsuite: introduce hostedlib effective target
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  f65e763... testsuite: introduce hostedlib effective target
Summary of changes (added commits):
---
  fb5b60d... testsuite: introduce hostedlib effective target
[gcc/aoliva/heads/testme] (513 commits) testsuite: introduce hostedlib effective target
The branch 'aoliva/heads/testme' was updated to point to: f65e763abf41... testsuite: introduce hostedlib effective target It previously pointed to: 6af8ecc3949a... testsuite: introduce hostedlib effective target Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 6af8ecc... testsuite: introduce hostedlib effective target Summary of changes (added commits): --- f65e763... testsuite: introduce hostedlib effective target d4d7c4e... Update gcc uk.po (*) aedf6f8... Daily bump. (*) fe66863... c++: vtable referring to "unavailable" virtual fn [PR116606 (*) 6abedee... ipa: Don't disable function parameter analysis for fat LTO (*) 2490951... libsanitizer: On aarch64 use hint #34 in prologue of libsan (*) 73afc3e... lower-bitint: Fix up __builtin_{add,sub}_overflow{,_p} biti (*) 6fb41c2... Don't call clean_symbol_name in create_tmp_var_name [PR1162 (*) 040b979... testsuite: remove -fwrapv from signbit-5.c (*) 0c4a95e... Daily bump. (*) a054ba5... libstdc++: Fix error handling in fs::hard_link_count for Wi (*) 35c9814... libstdc++: Fix overwriting files with fs::copy_file on Wind (*) ec1bcd1... libstdc++: Fix fs::hard_link_count behaviour on MinGW [PR11 (*) ee37d75... libstdc++: Specialize std::disable_sized_sentinel_for for s (*) 4696026... libstdc++: Add missing feature-test macro in various header (*) 3b8a67b... libstdc++: Fix std::variant to reject array types [PR116381 (*) 9899be7... aarch64: Fix ls64 intrinsic availability (*) 8485606... aarch64: Fix memtag intrinsic availability (*) 0a3a0d4... aarch64: Fix tme intrinsic availability (*) 422c3f1... aarch64: Move check_required_extensions (*) c6e04d1... aarch64: Refactor check_required_extensions (*) 0562522... lto: Don't check obj.found for offload section (*) 66eb7b7... Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook (*) 5c1955f... Daily bump. (*) 59157c0... i386: Fix vfpclassph non-optimizied intrin (*) 2ac3806... Daily bump. (*) ba9a3f1... Check avx upper register for parallel. 
(*) db4d810... Daily bump. (*) a2e32a8... Daily bump. (*) 657bf4a... Daily bump. (*) 5999dd8... Fortran: fix ICE with use with rename of namelist member [P (*) 552c7c1... Daily bump. (*) d3c14d4... Daily bump. (*) 4d6c0c0... Add gcc ka.po (*) f5b3dae... i386: testsuite: Adapt fentryname3.c for r14-811 change [PR (*) 377c3e9... i386: testsuite: Add -no-pie for pr113689-1.c [PR70150] (*) 87aea23... Daily bump. (*) 90b1232... Update gcc zh_CN.po (*) 75892d9... MIPS: Include missing mips16.S in libgcc/lib1funcs.S (*) ef9c53b... Daily bump. (*) b414466... Daily bump. (*) 5b75e1c... Daily bump. (*) 8de3153... Daily bump. (*) 27dc153... Align ix86_{move_max,store_max} with vectorizer. (*) ffd458d... Daily bump. (*) 5146af5... Daily bump. (*) 25812d8... [testsuite] [arm] [vect] adjust mve-vshr test [PR113281] (*) 76ac167... Daily bump. (*) 52da858... c++: fix ICE in convert_nontype_argument [PR116384] (*) af97b5e... testsuite: Prune warning about size of enums (*) 1fad6ad... Daily bump. (*) c725748... AVR: ad target/116407 - Fix linker error "relocation trunca (*) 919c42b... AVR: target/116407 - Fix linker error "relocation truncated (*) f4ce098... Daily bump. (*) f3d9c12... AVR: target/116390 - Fix an avrtiny asm out template. (*) 0296001... Daily bump. (*) 507b4e1... AVR: target/85624 - Use HImode for clrmemqi alignment. (*) edf95a4... testsuite: Verify -fshort-enums and -fno-short-enums in pr3 (*) 5c1f687... testsuite: Add -fno-short-enums to pr97315-1.C (*) 345d145... testsuite: Add -fwrapv to signbit-5.c (*) 45a771d... i386: Fix some vex insns that prohibit egpr (*) 86dacfb... aarch64: Add another use of force_subreg [PR115464] (*) 32b2129... aarch64: Fix invalid nested subregs [PR115464] (*) 4e7735a... Move ix86_align_loops into a separate pass and insert the p (*) ccca8df... Daily bump. (*) 63c51e0... c++/coroutines: fix passing *this to promise type, again [P (*) d9bd361... [PATCH] RISC-V: Fix unresolved mcpu-[67].c tests (*) 8c98f06... 
RISC-V: Make full-vec-move1.c test robust for optimization (*) 7268985... Daily bump. (*) e903ada... s390: Fix high-level builtins vec_gfmsum{,_accum}_128 (*) 5a63e19... Daily bump. (*) 7d9bb37... Add -mcpu=power11 support. (*) f688431... Daily bump. (*) 6bfd78c... Daily bump. (*) 534ffe7... Daily bump. (*) 6f1e687... Daily bump. (*) b0dd13e... i386: Fix up __builtin_ia32_b{extr{,i}_u{32,64},zhi_{s,d}i} (*) 897cd79... Daily bump. (*) 9ca1d7a... AVR: target/116295 - Fix unrecognizable insn with __flash r (*) a9255df... Daily bump. (*) 49e8eee... Daily bump. (*) b1102f7... c++: alias and non-type template parm [PR116223] (*) 987fc81... c++: parse error with -std=c++14 -fconcepts [PR116071] (*) ba26c47... hppa: Fix (plus (plus (mult (a) (mem_shadd
[gcc/aoliva/heads/testbase] (512 commits) Update gcc uk.po
The branch 'aoliva/heads/testbase' was updated to point to: d4d7c4e21984... Update gcc uk.po It previously pointed to: af1500dd8c00... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra Diff: Summary of changes (added commits): --- d4d7c4e... Update gcc uk.po (*) aedf6f8... Daily bump. (*) fe66863... c++: vtable referring to "unavailable" virtual fn [PR116606 (*) 6abedee... ipa: Don't disable function parameter analysis for fat LTO (*) 2490951... libsanitizer: On aarch64 use hint #34 in prologue of libsan (*) 73afc3e... lower-bitint: Fix up __builtin_{add,sub}_overflow{,_p} biti (*) 6fb41c2... Don't call clean_symbol_name in create_tmp_var_name [PR1162 (*) 040b979... testsuite: remove -fwrapv from signbit-5.c (*) 0c4a95e... Daily bump. (*) a054ba5... libstdc++: Fix error handling in fs::hard_link_count for Wi (*) 35c9814... libstdc++: Fix overwriting files with fs::copy_file on Wind (*) ec1bcd1... libstdc++: Fix fs::hard_link_count behaviour on MinGW [PR11 (*) ee37d75... libstdc++: Specialize std::disable_sized_sentinel_for for s (*) 4696026... libstdc++: Add missing feature-test macro in various header (*) 3b8a67b... libstdc++: Fix std::variant to reject array types [PR116381 (*) 9899be7... aarch64: Fix ls64 intrinsic availability (*) 8485606... aarch64: Fix memtag intrinsic availability (*) 0a3a0d4... aarch64: Fix tme intrinsic availability (*) 422c3f1... aarch64: Move check_required_extensions (*) c6e04d1... aarch64: Refactor check_required_extensions (*) 0562522... lto: Don't check obj.found for offload section (*) 66eb7b7... Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook (*) 5c1955f... Daily bump. (*) 59157c0... i386: Fix vfpclassph non-optimizied intrin (*) 2ac3806... Daily bump. (*) ba9a3f1... Check avx upper register for parallel. (*) db4d810... Daily bump. (*) a2e32a8... Daily bump. (*) 657bf4a... Daily bump. (*) 5999dd8... Fortran: fix ICE with use with rename of namelist member [P (*) 552c7c1... Daily bump. (*) d3c14d4... Daily bump. 
(*) 4d6c0c0... Add gcc ka.po (*) f5b3dae... i386: testsuite: Adapt fentryname3.c for r14-811 change [PR (*) 377c3e9... i386: testsuite: Add -no-pie for pr113689-1.c [PR70150] (*) 87aea23... Daily bump. (*) 90b1232... Update gcc zh_CN.po (*) 75892d9... MIPS: Include missing mips16.S in libgcc/lib1funcs.S (*) ef9c53b... Daily bump. (*) b414466... Daily bump. (*) 5b75e1c... Daily bump. (*) 8de3153... Daily bump. (*) 27dc153... Align ix86_{move_max,store_max} with vectorizer. (*) ffd458d... Daily bump. (*) 5146af5... Daily bump. (*) 25812d8... [testsuite] [arm] [vect] adjust mve-vshr test [PR113281] (*) 76ac167... Daily bump. (*) 52da858... c++: fix ICE in convert_nontype_argument [PR116384] (*) af97b5e... testsuite: Prune warning about size of enums (*) 1fad6ad... Daily bump. (*) c725748... AVR: ad target/116407 - Fix linker error "relocation trunca (*) 919c42b... AVR: target/116407 - Fix linker error "relocation truncated (*) f4ce098... Daily bump. (*) f3d9c12... AVR: target/116390 - Fix an avrtiny asm out template. (*) 0296001... Daily bump. (*) 507b4e1... AVR: target/85624 - Use HImode for clrmemqi alignment. (*) edf95a4... testsuite: Verify -fshort-enums and -fno-short-enums in pr3 (*) 5c1f687... testsuite: Add -fno-short-enums to pr97315-1.C (*) 345d145... testsuite: Add -fwrapv to signbit-5.c (*) 45a771d... i386: Fix some vex insns that prohibit egpr (*) 86dacfb... aarch64: Add another use of force_subreg [PR115464] (*) 32b2129... aarch64: Fix invalid nested subregs [PR115464] (*) 4e7735a... Move ix86_align_loops into a separate pass and insert the p (*) ccca8df... Daily bump. (*) 63c51e0... c++/coroutines: fix passing *this to promise type, again [P (*) d9bd361... [PATCH] RISC-V: Fix unresolved mcpu-[67].c tests (*) 8c98f06... RISC-V: Make full-vec-move1.c test robust for optimization (*) 7268985... Daily bump. (*) e903ada... s390: Fix high-level builtins vec_gfmsum{,_accum}_128 (*) 5a63e19... Daily bump. (*) 7d9bb37... Add -mcpu=power11 support. (*) f688431... 
Daily bump. (*) 6bfd78c... Daily bump. (*) 534ffe7... Daily bump. (*) 6f1e687... Daily bump. (*) b0dd13e... i386: Fix up __builtin_ia32_b{extr{,i}_u{32,64},zhi_{s,d}i} (*) 897cd79... Daily bump. (*) 9ca1d7a... AVR: target/116295 - Fix unrecognizable insn with __flash r (*) a9255df... Daily bump. (*) 49e8eee... Daily bump. (*) b1102f7... c++: alias and non-type template parm [PR116223] (*) 987fc81... c++: parse error with -std=c++14 -fconcepts [PR116071] (*) ba26c47... hppa: Fix (plus (plus (mult (a) (mem_shadd_constant)) (b)) (*) f2b5ca6... wide-int: Fix up mul_internal overflow checking [PR116224] (*) 3fe5720... libquadmath: Fix up libquadmath/math/sqrtq.c compilation in (*) cad2693... fortran: Fix up pasto in gfc_get_array_descr_info (*) ba45573... sh: Don't call make_ins
[gcc/aoliva/heads/testme] testsuite: introduce hostedlib effective target
The branch 'aoliva/heads/testme' was updated to point to:
  6af8ecc3949... testsuite: introduce hostedlib effective target
It previously pointed to:
  97ab3254903... more hostedlib notes
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  97ab325... more hostedlib notes
  9abd783... testsuite: introduce hostedlib effective target
Summary of changes (added commits):
---
  6af8ecc... testsuite: introduce hostedlib effective target
[gcc/aoliva/heads/testme] more hostedlib notes
The branch 'aoliva/heads/testme' was updated to point to:
  97ab3254903... more hostedlib notes
It previously pointed to:
  62adc720792... more hostedlib notes
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  62adc72... more hostedlib notes
Summary of changes (added commits):
---
  97ab325... more hostedlib notes
[gcc/aoliva/heads/testme] more hostedlib notes
The branch 'aoliva/heads/testme' was updated to point to:
  62adc720792b... more hostedlib notes
It previously pointed to:
  4c3d24ebc5a1... more hostedlib notes
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  4c3d24e... more hostedlib notes
Summary of changes (added commits):
---
  62adc72... more hostedlib notes
[gcc/aoliva/heads/testme] more hostedlib notes
The branch 'aoliva/heads/testme' was updated to point to:
  4c3d24ebc5a1... more hostedlib notes
It previously pointed to:
  93208984e9f2... more hostedlib notes
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  9320898... more hostedlib notes
Summary of changes (added commits):
---
  4c3d24e... more hostedlib notes
[gcc/aoliva/heads/testme] more hostedlib notes
The branch 'aoliva/heads/testme' was updated to point to:
  93208984e9f2... more hostedlib notes
It previously pointed to:
  352f61e0d8d1... more hostedlib notes
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  352f61e... more hostedlib notes
Summary of changes (added commits):
---
  9320898... more hostedlib notes
[gcc/aoliva/heads/testme] more hostedlib notes
The branch 'aoliva/heads/testme' was updated to point to:
  352f61e0d8d1... more hostedlib notes
It previously pointed to:
  07051d45ce80... more hostedlib notes
Diff:
!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---
  07051d4... more hostedlib notes
Summary of changes (added commits):
---
  352f61e... more hostedlib notes
[gcc(refs/users/aoliva/heads/testme)] more hostedlib notes
https://gcc.gnu.org/g:07051d45ce803cf70272fcbfce71828598c1d7c8 commit 07051d45ce803cf70272fcbfce71828598c1d7c8 Author: Alexandre Oliva Date: Sat Aug 31 14:54:41 2024 -0300 more hostedlib notes Diff: --- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c| 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c| 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-loop.c | 1 + gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c | 1 + .../c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c | 1 + gcc/testsuite/c-c++-common/gomp/pr103642.c | 1 + gcc/testsuite/c-c++-common/gomp/target-implicit-map-2.c | 2 ++ gcc/testsuite/c-c++-common/simulate-thread/bitfields-4.c| 1 + gcc/testsuite/c-c++-common/tm/malloc.c | 1 + gcc/testsuite/g++.dg/abi/mangle36.C | 1 + gcc/testsuite/g++.dg/abi/mangle40.C | 1 + gcc/testsuite/g++.dg/abi/mangle41.C | 1 + gcc/testsuite/g++.dg/cdce3.C| 1 + gcc/testsuite/g++.dg/contracts/contracts-post7.C| 1 + gcc/testsuite/g++.dg/contracts/pr110159.C | 1 + gcc/testsuite/g++.dg/contracts/pr115434.C | 1 + .../g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C | 2 ++ .../g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C | 2 ++ gcc/testsuite/g++.dg/coroutines/pr110635.C | 1 + gcc/testsuite/g++.dg/coroutines/pr110871.C | 2 ++ gcc/testsuite/g++.dg/coroutines/pr110872.C | 1 + gcc/testsuite/g++.dg/coroutines/symmetric-transfer-00-basic.C | 1 + 
gcc/testsuite/g++.dg/coroutines/torture/co-yield-03-tmpl-nondependent.C | 1 + gcc/testsuite/g++.dg/cpp/pr80005.C | 1 + gcc/testsuite/g++.dg/cpp0x/lambda/lambda-std-function.C | 1 + gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this8.C| 1 + gcc/testsuite/g++.dg/cpp0x/pr70887.C| 1 + gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C| 2 ++ gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C | 1 + gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic2.C | 1 + gcc/testsuite/g++.dg/cpp1z/constexpr-asm-1.C| 1 + gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C| 1 + gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C | 1 + gcc/testsuite/g++.dg/cpp23/ext-floating12.C | 1 + gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C | 1 + gcc/testsuite/g++.dg/cpp26/constexpr-new2.C | 1 + gcc/testsuite/g++.dg/cpp26/constexpr-voidptr1.C | 1 + gcc/testsuite/g++.dg/cpp26/feat-cxx26.C | 1 + gcc/testsuite/g++.dg/cpp2a/destroying-delete5.C | 1 + gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C | 1 + gcc/testsuite/g++.dg/diagnostic/missing-header-pr110164.C | 1 + gcc/testsuite/g++.dg/expr/anew1.C | 2 ++ gcc/testsuite/g++.dg/expr/anew2.C | 2 ++ gcc/testsuite/g++.dg/expr/anew3.C | 2 ++ gcc/testsuite/g++.dg/expr/anew4.C | 2 ++ gcc/testsuite/g++.dg/ext/builtin10.C| 1 + gcc/testsuite/g++.dg/ext/cleanup-10.C | 1 + gcc/testsuite/g++.dg/ext/cleanup-11.C | 1 + gcc/testsuite/g++.dg/ext/cleanup-5.C| 1 + gcc/testsuite/g++.dg/ext/cleanup-8.C| 1 + gcc/testsuite/g++.dg/ext/cleanup-9.C| 1 + gcc/testsuite/g++.dg/ext/is_invocabl
[gcc/aoliva/heads/testme] (60 commits) more hostedlib notes
The branch 'aoliva/heads/testme' was updated to point to: 07051d45ce80... more hostedlib notes It previously pointed to: 62b70aa09f1c... testsuite: introduce hostedlib effective target Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 62b70aa... testsuite: introduce hostedlib effective target Summary of changes (added commits): --- 07051d4... more hostedlib notes 9abd783... testsuite: introduce hostedlib effective target af1500d... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra (*) 410061b... [libstdc++] [testsuite] avoid async.cc loss of precision [P (*) 9223d17... [testsuite] add linkonly to dg-additional-sources [PR115295 (*) b9bf0c3... amdgcn: Remove TARGET_GCN5_PLUS (*) 023641d... amdgcn: Remove TARGET_GCN3 (*) 57af002... amdgcn: remove gfx803 "Fiji" support (*) 78dc2e2... PR modula2/116557 Remove physical address from the GPL head (*) 4bf758b... libsupc++: Fix handling of m68k extended real in (*) e4d3e7f... testsuite: Rename scanltranstree.exp -> scanltrans.exp (*) 2865719... Rename gimple_asm_input_p to gimple_asm_basic_p (*) a4b6c6a... Rename ASM_INPUT_P to ASM_BASIC_P (*) 5cbfb3a... lto/lto.cc: Fix build with not HAVE_WORKING_FORK (*) 6640a59... lto-wrapper: Honor -save-temps for ltrans' makefile (*) 571d045... ada: Diagnose too large size clause on floating-point type (*) 1c9a6d8... ada: Create usage entry for -gnatw_l (*) 2df253f... ada: Fix standard output stream for gnatcmd output (*) 91f0a3a... ada: Fix minor issues in -gnaty0's documentation (*) a004c28... ada: Documentation for generic type inference (*) 34437eb... ada: Small fixes for FreeBSD (*) cb690aa... ada: Also reset scope for some nested declaration (*) 905ab32... ada: Cleanup expansion of object declarations (*) 78acc6d... ada: Remove repeated guards in validity checks (*) 25d51fb... ranger: Fix up range computation for CLZ [PR116486] (*) 9aaedfc... load and store-lanes with SLP (*) 464067a... 
lower SLP load permutation to interleaving (*) eca320b... [PATCH] RISC-V: Optimize the cost of the DFmode register mo (*) 0562976... [committed][PR rtl-optimization/116544] Fix test for promot (*) f77435a... i386: Support vec_cmp for V8BF/V16BF/V32BF in AVX10.2 (*) e19f65b... i386: Support vectorized BF16 sqrt with AVX10.2 instruction (*) 29ef601... i386: Support vectorized BF16 smaxmin with AVX10.2 instruct (*) 6d294fb... i386: Support vectorized BF16 FMA with AVX10.2 instructions (*) f82fa0d... i386: Support vectorized BF16 add/sub/mul/div with AVX10.2 (*) 3b1dece... i386: Optimize generate insn for AVX10.2 compare (*) 86f5031... i386: Optimize ordered and nonequal (*) b1f9fbb... i386: Auto vectorize sdot_prod, usdot_prod, udot_prod with (*) 5239902... RISC-V: Add testcases for unsigned scalar quad and oct .SAT (*) ea81e21... RISC-V: Add testcases for unsigned scalar quad and oct .SAT (*) 56ed1df... RISC-V: Add testcases for form 4 of unsigned vector .SAT_AD (*) 72f3e90... RISC-V: Add testcases for form 3 of unsigned vector .SAT_AD (*) e96d4bf... RISC-V: Refactor gen zero_extend rtx for SAT_* when expand (*) 880834d... Daily bump. (*) 592a335... slsr: Use simple_dce_from_worklist in SLSR [PR116554] (*) f22788c... testsuite: Prune compilation messages for modules tests (*) 49fd9b3... Daily bump. (*) bac00c3... i386: Support read-modify-write memory operands in STV. (*) 2ac27bd... libobjc: Add cast to void* to disable warning for casting b (*) df89afb... AVR: Run pass avr-fuse-add a second time after pass_cprop_h (*) 60fc550... AVR: Tidy pass avr-fuse-add. (*) 7f27d1f... testsuite, c++, coroutines: Avoid 'unused' warnings [NFC]. (*) 2c27189... testsuite, c++, coroutines: Correct a test intent. (*) 049a927... c++, coroutines: Make and use a frame access helper. (*) b7e9f36... hppa: Enable PA 2.0 symbolic operands on ELF32 targets (*) ceda727... phiopt: Ignore some nop statements in heursics [PR116098] (*) 457805c... 
testsuite: Change what is being tested for pr66726-2.c (*) 79b5b50... Fortran: downgrade use associated namelist group name to le (*) afd9558... c++: Add unsequenced C++ testcase (*) dd346b6... c: Add support for unsequenced and reproducible attributes (*) dc476e5... AVR: Don't print a space after , when printing instructions (*) (*) This commit already exists in another branch. Because the reference `refs/users/aoliva/heads/testme' matches your hooks.email-new-commits-only configuration, no separate email is sent for this commit.
[gcc/aoliva/heads/testbase] (58 commits) [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra
The branch 'aoliva/heads/testbase' was updated to point to: af1500dd8c00... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra It previously pointed to: 673a448aa24e... Optimize initialization of small padded objects Diff: Summary of changes (added commits): --- af1500d... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra (*) 410061b... [libstdc++] [testsuite] avoid async.cc loss of precision [P (*) 9223d17... [testsuite] add linkonly to dg-additional-sources [PR115295 (*) b9bf0c3... amdgcn: Remove TARGET_GCN5_PLUS (*) 023641d... amdgcn: Remove TARGET_GCN3 (*) 57af002... amdgcn: remove gfx803 "Fiji" support (*) 78dc2e2... PR modula2/116557 Remove physical address from the GPL head (*) 4bf758b... libsupc++: Fix handling of m68k extended real in (*) e4d3e7f... testsuite: Rename scanltranstree.exp -> scanltrans.exp (*) 2865719... Rename gimple_asm_input_p to gimple_asm_basic_p (*) a4b6c6a... Rename ASM_INPUT_P to ASM_BASIC_P (*) 5cbfb3a... lto/lto.cc: Fix build with not HAVE_WORKING_FORK (*) 6640a59... lto-wrapper: Honor -save-temps for ltrans' makefile (*) 571d045... ada: Diagnose too large size clause on floating-point type (*) 1c9a6d8... ada: Create usage entry for -gnatw_l (*) 2df253f... ada: Fix standard output stream for gnatcmd output (*) 91f0a3a... ada: Fix minor issues in -gnaty0's documentation (*) a004c28... ada: Documentation for generic type inference (*) 34437eb... ada: Small fixes for FreeBSD (*) cb690aa... ada: Also reset scope for some nested declaration (*) 905ab32... ada: Cleanup expansion of object declarations (*) 78acc6d... ada: Remove repeated guards in validity checks (*) 25d51fb... ranger: Fix up range computation for CLZ [PR116486] (*) 9aaedfc... load and store-lanes with SLP (*) 464067a... lower SLP load permutation to interleaving (*) eca320b... [PATCH] RISC-V: Optimize the cost of the DFmode register mo (*) 0562976... [committed][PR rtl-optimization/116544] Fix test for promot (*) f77435a... 
i386: Support vec_cmp for V8BF/V16BF/V32BF in AVX10.2 (*) e19f65b... i386: Support vectorized BF16 sqrt with AVX10.2 instruction (*) 29ef601... i386: Support vectorized BF16 smaxmin with AVX10.2 instruct (*) 6d294fb... i386: Support vectorized BF16 FMA with AVX10.2 instructions (*) f82fa0d... i386: Support vectorized BF16 add/sub/mul/div with AVX10.2 (*) 3b1dece... i386: Optimize generate insn for AVX10.2 compare (*) 86f5031... i386: Optimize ordered and nonequal (*) b1f9fbb... i386: Auto vectorize sdot_prod, usdot_prod, udot_prod with (*) 5239902... RISC-V: Add testcases for unsigned scalar quad and oct .SAT (*) ea81e21... RISC-V: Add testcases for unsigned scalar quad and oct .SAT (*) 56ed1df... RISC-V: Add testcases for form 4 of unsigned vector .SAT_AD (*) 72f3e90... RISC-V: Add testcases for form 3 of unsigned vector .SAT_AD (*) e96d4bf... RISC-V: Refactor gen zero_extend rtx for SAT_* when expand (*) 880834d... Daily bump. (*) 592a335... slsr: Use simple_dce_from_worklist in SLSR [PR116554] (*) f22788c... testsuite: Prune compilation messages for modules tests (*) 49fd9b3... Daily bump. (*) bac00c3... i386: Support read-modify-write memory operands in STV. (*) 2ac27bd... libobjc: Add cast to void* to disable warning for casting b (*) df89afb... AVR: Run pass avr-fuse-add a second time after pass_cprop_h (*) 60fc550... AVR: Tidy pass avr-fuse-add. (*) 7f27d1f... testsuite, c++, coroutines: Avoid 'unused' warnings [NFC]. (*) 2c27189... testsuite, c++, coroutines: Correct a test intent. (*) 049a927... c++, coroutines: Make and use a frame access helper. (*) b7e9f36... hppa: Enable PA 2.0 symbolic operands on ELF32 targets (*) ceda727... phiopt: Ignore some nop statements in heursics [PR116098] (*) 457805c... testsuite: Change what is being tested for pr66726-2.c (*) 79b5b50... Fortran: downgrade use associated namelist group name to le (*) afd9558... c++: Add unsequenced C++ testcase (*) dd346b6... 
c: Add support for unsequenced and reproducible attributes (*) dc476e5... AVR: Don't print a space after , when printing instructions (*) (*) This commit already exists in another branch. Because the reference `refs/users/aoliva/heads/testbase' matches your hooks.email-new-commits-only configuration, no separate email is sent for this commit.
[gcc r15-3384] [testsuite] add linkonly to dg-additional-sources [PR115295]
https://gcc.gnu.org/g:9223d1715918e4e8e7a59471b228f815b4a3467c

commit r15-3384-g9223d1715918e4e8e7a59471b228f815b4a3467c
Author: Alexandre Oliva
Date: Mon Sep 2 11:31:51 2024 -0300

[testsuite] add linkonly to dg-additional-sources [PR115295]

The D testsuite shows it was a mistake to assume that dg-additional-sources are never to be used for compilation tests. Even if an output file is specified for compilation, extra module files can be named and used in the compilation without being flagged as errors.

Introduce a 'linkonly' flag for dg-additional-sources, and use it in pr95401.cc and other vector tests that default to run, so that its additional sources get discarded when vector tests downgrade to compile-only.

This reverts previous workarounds for this very circumstance, that relied on being able to run vector tests anyway, even after failing to detect runtime or hardware vector support.

for gcc/ChangeLog

PR d/115295
* doc/sourcebuild.texi (dg-additional-sources): Add linkonly.

for gcc/testsuite/ChangeLog

PR d/115295
* g++.dg/vect/pr95401.cc: Add linkonly to dg-additional-sources.
* g++.dg/vect/pr68762-1.cc: Likewise.
* g++.dg/vect/simd-clone-3.cc: Likewise.
* g++.dg/vect/simd-clone-5.cc: Likewise.
* gcc.dg/vect/vect-simd-clone-10.c: Likewise. Drop dg-do run.
* gcc.dg/vect/vect-simd-clone-12.c: Likewise. Likewise.
* lib/gcc-defs.exp (additional_sources_omit_on_compile): New.
(dg-additional-sources): Add to it on linkonly.
(dg-additional-files-options): Omit select sources on compile.
Diff:
---
 gcc/doc/sourcebuild.texi                       |  9 ---
 gcc/testsuite/g++.dg/vect/pr68762-1.cc         |  2 +-
 gcc/testsuite/g++.dg/vect/pr95401.cc           |  2 +-
 gcc/testsuite/g++.dg/vect/simd-clone-3.cc      |  2 +-
 gcc/testsuite/g++.dg/vect/simd-clone-5.cc      |  2 +-
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-10.c |  4 +--
 gcc/testsuite/gcc.dg/vect/vect-simd-clone-12.c |  4 +--
 gcc/testsuite/lib/gcc-defs.exp                 | 35 --
 8 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 0636fc0567c5..7c7094dc5a92 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1328,15 +1328,16 @@ to @var{var_value} before execution of the program created by the test.
 Specify additional files, other than source files, that must be copied
 to the system where the compiler runs.
 
-@item @{ dg-additional-sources "@var{filelist}" [@{ target @var{selector} @}] @}
+@item @{ dg-additional-sources "@var{filelist}" [@{ \[linkonly\] \[target @var{selector}\] @}] @}
 Specify additional source files to appear in the compile line
 following the main test file.
 If the directive includes the optional @samp{@{ @var{selector} @}}
 then the additional sources are only added if the target system
 matches the @var{selector}.
-Additional sources are generally used only in @samp{link} and @samp{run}
-tests; they are reported as unsupported and discarded in other kinds of
-tests that direct the compiler to output to a single file.
+If @samp{linkonly} is specified, additional sources are used only in
+@samp{link} and @samp{run} tests; they are reported as unsupported and
+discarded in other kinds of tests that direct the compiler to output to
+a single file.
 @end table
 
 @subsubsection Add checks at the end of a test
diff --git a/gcc/testsuite/g++.dg/vect/pr68762-1.cc b/gcc/testsuite/g++.dg/vect/pr68762-1.cc
index 118a301ab90d..53cc6e4c6dfa 100644
--- a/gcc/testsuite/g++.dg/vect/pr68762-1.cc
+++ b/gcc/testsuite/g++.dg/vect/pr68762-1.cc
@@ -2,7 +2,7 @@
 // { dg-require-effective-target vect_simd_clones }
 // { dg-additional-options "-fopenmp-simd -fno-inline" }
 // { dg-additional-options "-mavx" { target avx_runtime } }
-// { dg-additional-sources "pr68762-2.cc" }
+// { dg-additional-sources "pr68762-2.cc" linkonly }
 
 #include "pr68762.h"
 
diff --git a/gcc/testsuite/g++.dg/vect/pr95401.cc b/gcc/testsuite/g++.dg/vect/pr95401.cc
index 6a56dab09572..8b1be4f24252 100644
--- a/gcc/testsuite/g++.dg/vect/pr95401.cc
+++ b/gcc/testsuite/g++.dg/vect/pr95401.cc
@@ -1,5 +1,5 @@
 // { dg-additional-options "-mavx2 -O3" { target avx2_runtime } }
-// { dg-additional-sources pr95401a.cc }
+// { dg-additional-sources pr95401a.cc linkonly }
 
 extern int var_9;
 extern unsigned var_14;
diff --git a/gcc/testsuite/g++.dg/vect/simd-clone-3.cc b/gcc/testsuite/g++.dg/vect/simd-clone-3.cc
index 1057a7eb5f6a..4dd9d15d1a3b 100644
--- a/gcc/testsuite/g++.dg/vect/simd-clone-3.cc
+++ b/gcc/testsuite/g++.dg/vect/simd-clone-3.cc
@@ -1,7 +1,7 @@
 // { dg-require-effective-target vect_simd_clones }
 // { dg-additional-options "-fopenmp-simd -fno-inline" }
 // { dg-additional-options "-mavx" { target
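As an illustration of the directive documented in the patch above, a test that needs a second translation unit only when it is linked or run might look like the sketch below. The file name linkonly-helper.c and the function helper are hypothetical, chosen for this example; only the directive syntax (the linkonly keyword following the file list) is taken from the patch.

```c
/* Hypothetical test file, not part of the patch.  */
/* { dg-do run } */
/* { dg-additional-sources "linkonly-helper.c" linkonly } */

/* Assumed to be defined in linkonly-helper.c (hypothetical name).  */
extern int helper (void);

int
main (void)
{
  /* Succeeds when the helper from the additional source returns 0.  */
  return helper () != 0;
}
```

Going by the documentation change, if such a test is downgraded to a kind that directs the compiler to output to a single file, the additional source would be reported as unsupported and discarded rather than passed on the compile line.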
[gcc/aoliva/heads/testme] (186 commits) testsuite: introduce hostedlib effective target
The branch 'aoliva/heads/testme' was updated to point to: 62b70aa09f1c... testsuite: introduce hostedlib effective target It previously pointed to: beba216fee9f... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- beba216... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra ce14000... Optimize initialization of small padded objects Summary of changes (added commits): --- 62b70aa... testsuite: introduce hostedlib effective target 673a448... Optimize initialization of small padded objects (*) 08693e2... Daily bump. (*) b1765a5... c++: add fixed test [PR101099] (*) ffd56dc... c++: add fixed test [PR115616] (*) f93a38f... c++: fix used but not defined warning for friend (*) b222122... Fortran: default-initialization of derived-type function re (*) 5020f8e... gdbhooks: Fix printing of vec with vl_ptr layout (*) 3fb9072... Don't remove /usr/lib and /lib from when passing to the lin (*) 4d2cbe2... middle-end: Remove integer_three_node [PR116537] (*) 04d11de... expand: Small speed up expansion of __builtin_prefetch (*) 87ce817... PR modula2/116181: m2rts fix -Wodr warning (*) d48273f... Avoid division by zero via constant_multiple_p (*) e7c7397... Do not bother with reassociation in SLP discovery for singl (*) b748e2e... c++: Allow standard attributes after closing square bracket (*) ab214ef... Check avx upper register for parallel. (*) 350d627... Daily bump. (*) aff7f67... SARIF output: implement embedded URLs in messages (§3.11.6 (*) e31b617... pretty-print: reimplement pp_format with a new struct pp_to (*) 68a0ca6... pretty-print: move class chunk_info into its own header (*) 464a3d2... Use std::unique_ptr for optinfo_item (*) 6bfeba1... Fortran: fix ICE with use with rename of namelist member [P (*) 81c4798... hppa: Fix handling of unscaled index addresses on HP-UX (*) 215c7e3... expand: Allow widdening optab when expanding popcount==1 [P (*) cdd5dd2... 
ada: Fix assertion failure on private limited with clause (*) d506247... ada: Fix internal error on concatenation of discriminant-de (*) a50584b... ada: Missing legality check when type completed (*) 4994069... ada: Fix missing finalization for call to function returnin (*) c2e3326... ada: Print Insertion_Sloc in dmsg (*) bb7a166... ada: Use the same warning character in continuation message (*) ad4c549... ada: Restructure continuation message for pretty printing (*) f60b53c... ada: Improve Inspection_Point warning (*) 4825bbf... ada: Avoid creating continuation messages without an intend (*) f872bba... ada: Parse the attributes of continuation messages correctl (*) 446f415... ada: Use consistent type continuations messages (*) dbaf2c0... ada: Extract line fitting algorithm (*) 299cd64... ada: Ensure validity checks for private scalar types (*) 6a3ff84... ada: Display actual line length in line length check (*) a383d7b... ada: Proper handling for iterator associations in array agg (*) 567e36c... ada: First controlling parameter aspect (*) 6b4b5b4... ada: Update documentation for conditional when constructs (*) ac6d433... Allow subregs around constant displacements [PR116516] (*) 00ec6bd... Make some smallest_int_mode_for_size calls cope with failur (*) 07e5e05... AVR: target/115830 - Make better use of SREG.N and SREG.Z. (*) d9c54e9... c++: don't remove labels during coro-early-expand-ifns [PR1 (*) bd2ccc2... AVR: Outsource code for avr-specific passes to new avr-pass (*) 4b729d2... testsuite: Fix up refactored scanltranstree.exp functions [ (*) 4ff4875... RISC-V: Fix subreg of VLS modes larger than a vector [PR116 (*) 3cb92be... i386: Support wide immediate constants in STV. (*) 155da08... Write LF_MFUNC_ID types for CodeView struct member function (*) c5043d8... Record member functions in CodeView struct definitions (*) 6a9932e... Record static data members in CodeView structs (*) 310fd68... Handle scoping in CodeView LF_FUNC_ID types (*) 3501226... 
Handle namespaced names for CodeView (*) 6cd806a... Daily bump. (*) 9f79c7d... c++: wrong error due to std::initializer_list opt [PR116476 (*) b8ef805... PR modula2/116181 remove ODR warnings from library interfac (*) 3c89c41... expand: Add debug dump on the cost for `popcount==1` expand (*) b68561d... libstdc++: Fix autoconf check for O_NONBLOCK in (*) 51b0fef... libstdc++: Fix -Wunused-parameter warnings in Networking TS (*) 0e2b3db... libstdc++: Fix -Wunused-variable warning in (*) a59f1cc... libstdc++: Remove unused typedef in (*) 9740a1b... doc: Add Dhruv Matani to Contributors (*) c2ad7b2... libstdc++: Fix @file for target-specific opt_random.h (*) f6ed7a6... libstdc++: Fix @headername for bits/cpp_type_traits.h (*) 898f013... AVR: Overhaul the avr-ifelse RTL optimizatio
[gcc/aoliva/heads/testbase] (185 commits) Optimize initialization of small padded objects
The branch 'aoliva/heads/testbase' was updated to point to: 673a448aa24e... Optimize initialization of small padded objects It previously pointed to: 3ff1b91e7729... Daily bump. Diff: Summary of changes (added commits): --- 673a448... Optimize initialization of small padded objects (*) 08693e2... Daily bump. (*) b1765a5... c++: add fixed test [PR101099] (*) ffd56dc... c++: add fixed test [PR115616] (*) f93a38f... c++: fix used but not defined warning for friend (*) b222122... Fortran: default-initialization of derived-type function re (*) 5020f8e... gdbhooks: Fix printing of vec with vl_ptr layout (*) 3fb9072... Don't remove /usr/lib and /lib from when passing to the lin (*) 4d2cbe2... middle-end: Remove integer_three_node [PR116537] (*) 04d11de... expand: Small speed up expansion of __builtin_prefetch (*) 87ce817... PR modula2/116181: m2rts fix -Wodr warning (*) d48273f... Avoid division by zero via constant_multiple_p (*) e7c7397... Do not bother with reassociation in SLP discovery for singl (*) b748e2e... c++: Allow standard attributes after closing square bracket (*) ab214ef... Check avx upper register for parallel. (*) 350d627... Daily bump. (*) aff7f67... SARIF output: implement embedded URLs in messages (§3.11.6 (*) e31b617... pretty-print: reimplement pp_format with a new struct pp_to (*) 68a0ca6... pretty-print: move class chunk_info into its own header (*) 464a3d2... Use std::unique_ptr for optinfo_item (*) 6bfeba1... Fortran: fix ICE with use with rename of namelist member [P (*) 81c4798... hppa: Fix handling of unscaled index addresses on HP-UX (*) 215c7e3... expand: Allow widdening optab when expanding popcount==1 [P (*) cdd5dd2... ada: Fix assertion failure on private limited with clause (*) d506247... ada: Fix internal error on concatenation of discriminant-de (*) a50584b... ada: Missing legality check when type completed (*) 4994069... ada: Fix missing finalization for call to function returnin (*) c2e3326... 
ada: Print Insertion_Sloc in dmsg (*) bb7a166... ada: Use the same warning character in continuation message (*) ad4c549... ada: Restructure continuation message for pretty printing (*) f60b53c... ada: Improve Inspection_Point warning (*) 4825bbf... ada: Avoid creating continuation messages without an intend (*) f872bba... ada: Parse the attributes of continuation messages correctl (*) 446f415... ada: Use consistent type continuations messages (*) dbaf2c0... ada: Extract line fitting algorithm (*) 299cd64... ada: Ensure validity checks for private scalar types (*) 6a3ff84... ada: Display actual line length in line length check (*) a383d7b... ada: Proper handling for iterator associations in array agg (*) 567e36c... ada: First controlling parameter aspect (*) 6b4b5b4... ada: Update documentation for conditional when constructs (*) ac6d433... Allow subregs around constant displacements [PR116516] (*) 00ec6bd... Make some smallest_int_mode_for_size calls cope with failur (*) 07e5e05... AVR: target/115830 - Make better use of SREG.N and SREG.Z. (*) d9c54e9... c++: don't remove labels during coro-early-expand-ifns [PR1 (*) bd2ccc2... AVR: Outsource code for avr-specific passes to new avr-pass (*) 4b729d2... testsuite: Fix up refactored scanltranstree.exp functions [ (*) 4ff4875... RISC-V: Fix subreg of VLS modes larger than a vector [PR116 (*) 3cb92be... i386: Support wide immediate constants in STV. (*) 155da08... Write LF_MFUNC_ID types for CodeView struct member function (*) c5043d8... Record member functions in CodeView struct definitions (*) 6a9932e... Record static data members in CodeView structs (*) 310fd68... Handle scoping in CodeView LF_FUNC_ID types (*) 3501226... Handle namespaced names for CodeView (*) 6cd806a... Daily bump. (*) 9f79c7d... c++: wrong error due to std::initializer_list opt [PR116476 (*) b8ef805... PR modula2/116181 remove ODR warnings from library interfac (*) 3c89c41... 
expand: Add debug dump on the cost for `popcount==1` expand (*) b68561d... libstdc++: Fix autoconf check for O_NONBLOCK in (*) 51b0fef... libstdc++: Fix -Wunused-parameter warnings in Networking TS (*) 0e2b3db... libstdc++: Fix -Wunused-variable warning in (*) a59f1cc... libstdc++: Remove unused typedef in (*) 9740a1b... doc: Add Dhruv Matani to Contributors (*) c2ad7b2... libstdc++: Fix @file for target-specific opt_random.h (*) f6ed7a6... libstdc++: Fix @headername for bits/cpp_type_traits.h (*) 898f013... AVR: Overhaul the avr-ifelse RTL optimization pass. (*) 6661944... Add gcc ka.po (*) 15f857a... c++: ICE with ()-init and TARGET_EXPR eliding [PR116424] (*) abeecce... aarch64: Assume zero gather/scatter set-up cost for -mtune= (*) 3e27ea2... aarch64: Fix gather x32/x64 selection (*) 035c196... aarch64: Add a test for zeroing <64bits>x2_t structures (*) 3c9338b... Tweak documentation of ASM_INPUT_P (*) bdcd
[gcc r15-3328] Optimize initialization of small padded objects
https://gcc.gnu.org/g:673a448aa24efedd5ac140ebf7bfe652d7a6a846

commit r15-3328-g673a448aa24efedd5ac140ebf7bfe652d7a6a846
Author: Alexandre Oliva
Date:   Sat Aug 31 06:03:12 2024 -0300

    Optimize initialization of small padded objects

    When small objects containing padding bits (or bytes) are fully
    initialized, we will often store them in registers, and setting
    bitfields and other small fields will attempt to preserve the
    uninitialized padding bits, which tends to be expensive.
    Zero-initializing registers, OTOH, tends to be cheap.

    So, if we're optimizing, zero-initialize such small padded objects
    even if that's not needed for correctness.  We can't zero-initialize
    all such padding objects, though: if there's no padding whatsoever,
    and all fields are initialized with nonzero, the zero initialization
    would be flagged as dead.  That's why we introduce machinery to
    detect whether objects have padding bits.  I considered
    distinguishing between bitfields, units and larger padding elements,
    but I didn't pursue that distinction.

    Since the object's zero-initialization subsumes fields'
    zero-initialization, the empty string test in builtin-snprintf-6.c's
    test_assign_aggregate would regress without the addition of
    native_encode_constructor.

    for gcc/ChangeLog

            * expr.cc (categorize_ctor_elements_1): Change p_complete to
            int, to distinguish complete initialization in presence or
            absence of uninitialized padding bits.
            (categorize_ctor_elements): Likewise.  Adjust all callers...
            * expr.h (categorize_ctor_elements): ... and declaration.
            (type_has_padding_at_level_p): New.
            * gimple-fold.cc (type_has_padding_at_level_p): New.
            * fold-const.cc (native_encode_constructor): New.
            (native_encode_expr): Call it.
            * gimplify.cc (gimplify_init_constructor): Clear small
            non-addressable non-volatile objects with padding or
            other uninitialized fields as an optimization.

    for gcc/testsuite/ChangeLog

            * gcc.dg/init-pad-1.c: New.
Diff:
---
 gcc/expr.cc                       | 20 ++--
 gcc/expr.h                        |  3 ++-
 gcc/fold-const.cc                 | 36
 gcc/gimple-fold.cc                | 50 +++
 gcc/gimplify.cc                   | 14 ++-
 gcc/testsuite/gcc.dg/init-pad-1.c | 18 ++
 6 files changed, 132 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 8d17a5a39b4b..320be8b17a13 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7096,7 +7096,7 @@ count_type_elements (const_tree type, bool for_ctor_p)
 static bool
 categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
                             HOST_WIDE_INT *p_unique_nz_elts,
-                            HOST_WIDE_INT *p_init_elts, bool *p_complete)
+                            HOST_WIDE_INT *p_init_elts, int *p_complete)
 {
   unsigned HOST_WIDE_INT idx;
   HOST_WIDE_INT nz_elts, unique_nz_elts, init_elts, num_fields;
@@ -7218,7 +7218,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
 
   if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor),
                                                 num_fields, elt_type))
-    *p_complete = false;
+    *p_complete = 0;
+  else if (*p_complete > 0
+           && type_has_padding_at_level_p (TREE_TYPE (ctor)))
+    *p_complete = -1;
 
   *p_nz_elts += nz_elts;
   *p_unique_nz_elts += unique_nz_elts;
@@ -7239,7 +7242,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
      and place it in *P_ELT_COUNT.
    * whether the constructor is complete -- in the sense that every
      meaningful byte is explicitly given a value --
-     and place it in *P_COMPLETE.
+     and place it in *P_COMPLETE:
+     - 0 if any field is missing
+     - 1 if all fields are initialized, and there's no padding
+     - -1 if all fields are initialized, but there's padding
 
    Return whether or not CTOR is a valid static constant initializer, the same
    as "initializer_constant_valid_p (CTOR, TREE_TYPE (CTOR)) != 0".  */
@@ -7247,12 +7253,12 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
 bool
 categorize_ctor_elements (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
                           HOST_WIDE_INT *p_unique_nz_elts,
-                          HOST_WIDE_INT *p_init_elts, bool *p_complete)
+                          HOST_WIDE_INT *p_init_elts, int *p_complete)
 {
   *p_nz_elts = 0;
   *p_unique_nz_elts = 0;
   *p_init_elts = 0;
-  *p_complete = true;
+  *p_complete = 1;
 
   return categorize_ctor_elements_1 (ctor, p_nz_elts, p_unique_nz_elts,
                                      p_init_elts, p_complete);
@@ -7313,7 +7319,7 @@ mostly_zeros_p (const_tree exp)
   if (TR
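The three-valued *P_COMPLETE convention introduced by the hunks above can be modeled in isolation. The sketch below is a standalone illustration, not GCC code: update_complete and its two boolean parameters are hypothetical stand-ins for the calls to complete_ctor_at_level_p and type_has_padding_at_level_p, and it only mirrors the update rule visible in the diff.

```c
#include <stdbool.h>

/* Hypothetical model of the per-level update of *P_COMPLETE:
     0  -> some field is missing
     1  -> all fields initialized, no padding seen so far
    -1  -> all fields initialized, but padding is present
   ALL_FIELDS_INITIALIZED stands in for complete_ctor_at_level_p and
   HAS_PADDING for type_has_padding_at_level_p; both are assumptions
   made so the sketch is self-contained.  */
int
update_complete (int complete, bool all_fields_initialized, bool has_padding)
{
  if (complete && !all_fields_initialized)
    complete = 0;            /* Incomplete overrides everything else.  */
  else if (complete > 0 && has_padding)
    complete = -1;           /* Complete, but with padding bits.  */
  return complete;
}
```

Starting from 1, as categorize_ctor_elements does in the patch, the value can move to -1 or 0 but, in this model, never returns to 1 once padding or a missing field has been seen.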
[gcc/aoliva/heads/testme] [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra
The branch 'aoliva/heads/testme' was updated to point to: beba216fee9f... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra It previously pointed to: 73bc0445c988... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 73bc044... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra Summary of changes (added commits): --- beba216... [libstdc++-v3] [testsuite] improve future/*/poll.cc calibra
[gcc(refs/users/aoliva/heads/testme)] Optimize initialization of small padded objects
https://gcc.gnu.org/g:ce140008fff6b1e62674be932e3e908232337192

commit ce140008fff6b1e62674be932e3e908232337192
Author: Alexandre Oliva
Date:   Sat Aug 24 01:52:09 2024 -0300

    Optimize initialization of small padded objects

    When small objects containing padding bits (or bytes) are fully
    initialized, we will often store them in registers, and setting
    bitfields and other small fields will attempt to preserve the
    uninitialized padding bits, which tends to be expensive.
    Zero-initializing registers, OTOH, tends to be cheap.

    So, if we're optimizing, zero-initialize such small padded objects
    even if that's not needed for correctness.  We can't zero-initialize
    all such padding objects, though: if there's no padding whatsoever,
    and all fields are initialized with nonzero, the zero initialization
    would be flagged as dead.  That's why we introduce machinery to
    detect whether objects have padding bits.  I considered
    distinguishing between bitfields, units and larger padding elements,
    but I didn't pursue that distinction.

    Since the object's zero-initialization subsumes fields'
    zero-initialization, the empty string test in builtin-snprintf-6.c's
    test_assign_aggregate would regress without the addition of
    native_encode_constructor.

    for gcc/ChangeLog

            * expr.cc (categorize_ctor_elements_1): Change p_complete to
            int, to distinguish complete initialization in presence or
            absence of uninitialized padding bits.
            (categorize_ctor_elements): Likewise.  Adjust all callers...
            * expr.h (categorize_ctor_elements): ... and declaration.
            (type_has_padding_at_level_p): New.
            * gimple-fold.cc (type_has_padding_at_level_p): New.
            * fold-const.cc (native_encode_constructor): New.
            (native_encode_expr): Call it.
            * gimplify.cc (gimplify_init_constructor): Clear small
            non-addressable non-volatile objects with padding or
            other uninitialized fields as an optimization.

    for gcc/testsuite/ChangeLog

            * gcc.dg/init-pad-1.c: New.
Diff:
---
 gcc/expr.cc                       | 20 ++--
 gcc/expr.h                        |  3 ++-
 gcc/fold-const.cc                 | 36
 gcc/gimple-fold.cc                | 50 +++
 gcc/gimplify.cc                   | 14 ++-
 gcc/testsuite/gcc.dg/init-pad-1.c | 18 ++
 6 files changed, 132 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 8d17a5a39b4b..320be8b17a13 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7096,7 +7096,7 @@ count_type_elements (const_tree type, bool for_ctor_p)
 static bool
 categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
                             HOST_WIDE_INT *p_unique_nz_elts,
-                            HOST_WIDE_INT *p_init_elts, bool *p_complete)
+                            HOST_WIDE_INT *p_init_elts, int *p_complete)
 {
   unsigned HOST_WIDE_INT idx;
   HOST_WIDE_INT nz_elts, unique_nz_elts, init_elts, num_fields;
@@ -7218,7 +7218,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
 
   if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor),
                                                 num_fields, elt_type))
-    *p_complete = false;
+    *p_complete = 0;
+  else if (*p_complete > 0
+           && type_has_padding_at_level_p (TREE_TYPE (ctor)))
+    *p_complete = -1;
 
   *p_nz_elts += nz_elts;
   *p_unique_nz_elts += unique_nz_elts;
@@ -7239,7 +7242,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
      and place it in *P_ELT_COUNT.
    * whether the constructor is complete -- in the sense that every
      meaningful byte is explicitly given a value --
-     and place it in *P_COMPLETE.
+     and place it in *P_COMPLETE:
+     - 0 if any field is missing
+     - 1 if all fields are initialized, and there's no padding
+     - -1 if all fields are initialized, but there's padding
 
    Return whether or not CTOR is a valid static constant initializer, the same
    as "initializer_constant_valid_p (CTOR, TREE_TYPE (CTOR)) != 0".  */
@@ -7247,12 +7253,12 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
 bool
 categorize_ctor_elements (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
                           HOST_WIDE_INT *p_unique_nz_elts,
-                          HOST_WIDE_INT *p_init_elts, bool *p_complete)
+                          HOST_WIDE_INT *p_init_elts, int *p_complete)
 {
   *p_nz_elts = 0;
   *p_unique_nz_elts = 0;
   *p_init_elts = 0;
-  *p_complete = true;
+  *p_complete = 1;
 
   return categorize_ctor_elements_1 (ctor, p_nz_elts, p_unique_nz_elts,
                                      p_init_elts, p_complete);
@@ -7313,7 +7319,7 @@ mostly_zeros_p (const_tree exp)
   if (TREE_CODE (e
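For readers unfamiliar with the kind of object the commit message has in mind, the sketch below shows a small aggregate that typically contains padding and is fully initialized field by field. The type and function names are hypothetical and do not come from the patch or from gcc.dg/init-pad-1.c, whose contents are not shown here. Whether the new zero-initialization actually triggers for this code depends on optimization level and target, which this sketch does not attempt to verify.

```c
#include <stddef.h>

/* Hypothetical small aggregate.  On common ABIs where int is 4-byte
   aligned, there are padding bytes between TAG and VALUE, but the
   exact layout is target-dependent.  */
struct small_padded
{
  char tag;
  int value;
};

/* Number of bytes in the object that belong to neither field.  */
size_t
small_padded_padding_bytes (void)
{
  return sizeof (struct small_padded) - (sizeof (char) + sizeof (int));
}

/* Every field receives a value; the padding is left unspecified by
   the source, which is the situation the commit message describes.  */
struct small_padded
make_small_padded (char tag, int value)
{
  struct small_padded r = { tag, value };
  return r;
}
```

When small_padded_padding_bytes returns a nonzero value, the local r in make_small_padded is an object of the sort the message calls a small padded object: fully initialized, yet with padding bits that would otherwise have to be preserved.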
[gcc/aoliva/heads/testme] (223 commits) Optimize initialization of small padded objects
The branch 'aoliva/heads/testme' was updated to point to: ce140008fff6... Optimize initialization of small padded objects It previously pointed to: d2b89c77861c... Dump aliases in -fcallgraph-info Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- d2b89c7... Dump aliases in -fcallgraph-info 7b50738... Optimize initialization of small padded objects Summary of changes (added commits): --- ce14000... Optimize initialization of small padded objects 3ff1b91... Daily bump. (*) a523d1b... libstdc++: Update and clarify Doxygen version requirements (*) 5cfee93... libstdc++: Hide std::tuple internals from Doxygen docs (*) cd8e0ea... libstdc++: Improve Doxygen docs for std::allocator_traits s (*) 5dce17e... RISC-V: Use encoded nelts when calling repeating_sequence_p (*) a9f5e23... ifcvt: Do not overwrite results in noce_convert_multiple_se (*) c9e2d0e... ifcvt: disallow call instructions in noce_convert_multiple_ (*) 6e68c3d... rs6000: Fix PTImode handling in power8 swap optimization pa (*) cb51e0b... lto: Don't check obj.found for offload section (*) c429d50... libstdc++: Implement LWG 3746 for std::optional (*) 952e67c... libstdc++: Optimize __try_use_facet for const types (*) 8cf51d7... libstdc++: Fix std::allocator_traits::construct constraints (*) 43b8153... libstdc++: Only use std::time_put in std::format for non-C (*) 591b719... libstdc++: Define operator== for hash table iterators [PR11 (*) 125bab2... libstdc++: Fix std::random_shuffle for low RAND_MAX [PR8893 (*) de1923f... tree-optimization/116463 - complex lowering leaves around d (*) a35dd27... libstdc++: Make debug sequence members mutable [PR116369] (*) 9115593... libstdc++: Use noexcept insted of throw() in src/c++11/debu (*) 0bb2652... libstdc++: Simplify C++20 implementation of std::variant (*) b25b101... libstdc++: Make std::vector::reference constructor pr (*) f9f599a... Revert "Fortran: Fix class transformational intrinsic calls (*) 0798887... 
Match: Support form 4 for unsigned integer .SAT_TRUNC (*) b2c1d7c... libcpp: bump padding size in _cpp_convert_input [PR116458] (*) 96fe95b... optabs-query: Use opt_machine_mode for smallest_int_mode_fo (*) c22d57c... RISC-V: Expand vec abs without masking. (*) a8ae8f9... Fix test failure on powerpc targets (*) 19c22fb... ada: Fix crash on aliased variable with packed array type a (*) 87bdd17... ada: String interpolation: report error without Extensions (*) 509cc70... ada: Fix incorrect tracebacks on Windows (*) a7ff045... ada: Crash on string interpolation with custom string types (*) 7dd9c7d... ada: Implicit_Dereference aspect specification for subtype (*) 24c5396... ada: Eliminated-mode overflow check not eliminated (*) 40903c7... ada: Update libraries with the limited flag (*) dce0d46... ada: Emit a warning on inheritly limited types (*) 92a9b55... ada: First controlling parameter aspect (*) 0020cae... ada: Fix style in lines starting with assignment operator (*) ff356c0... ada: Cleanup validity of boolean operators (*) f67d108... ada: Simplify validity checks for scalar parameters (*) 8719b18... ada: Fix validity checks for named parameter associations (*) 4522f1f... ada: First controlling parameter aspect (*) aa95cd9... ada: Error missing when 'access is applied to an interface (*) 8a41af7... ada: First controlling parameter aspect (*) a071fcd... fortran: Minor fix to -ffrontend-optimize description (*) afa9080... doc: Specifically link to GPL v3.0 for GM2 (*) 0636de8... Remove unnecessary view_convert obsoleted by [PR86468]. (*) f6b10fe... testsuite: Fix vect-mod-var.c for division by 0 [PR116461] (*) 2cd783b... Daily bump. (*) da043f9... testsuite: Fix gcc.dg/torture/pr116420.c for targets defaul (*) c937773... [PR rtl-optimization/116420] Fix interesting block bitmap D (*) 8e0da56... libstdc++: Add some missing ranges feature-test macro tests (*) 792adb8... Recompute TYPE_MODE and DECL_MODE for aggregate type for ac (*) a025081... 
RISC-V: Fix vector cfi notes for stack-clash protection (*) 51761c5... libstdc++: Optimize std::projected (*) 6202324... libstdc++: Implement P2997R1 changes to the indirect invoca (*) b552730... libstdc++: Implement P2609R3 changes to the indirect invoca (*) a98dd53... Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook (*) b07f8a3... fold: Fix `a * 1j` if a has side effects [PR116454] (*) 4e905bd... fix single argument static_assert (*) 313aa73... PR target/116365: Add user-friendly arguments to --param aa (*) 76c2954... RISC-V: Enable -gvariable-location-views by default (*) bcb33b1... Do not emit a redundant DW_TAG_lexical_block for inlined su (*) 9bbad36... PR tree-optimization/101390: Vectorize modulo operator (*) 2349609... Dump aliases in -fcallgraph-info (*) c1aba5e... Makefile.tpl: fix w
[gcc/aoliva/heads/testbase] (222 commits) Daily bump.
The branch 'aoliva/heads/testbase' was updated to point to: 3ff1b91e7729... Daily bump. It previously pointed to: 4d2e8fcdaf32... Daily bump. Diff: Summary of changes (added commits): --- 3ff1b91... Daily bump. (*) a523d1b... libstdc++: Update and clarify Doxygen version requirements (*) 5cfee93... libstdc++: Hide std::tuple internals from Doxygen docs (*) cd8e0ea... libstdc++: Improve Doxygen docs for std::allocator_traits s (*) 5dce17e... RISC-V: Use encoded nelts when calling repeating_sequence_p (*) a9f5e23... ifcvt: Do not overwrite results in noce_convert_multiple_se (*) c9e2d0e... ifcvt: disallow call instructions in noce_convert_multiple_ (*) 6e68c3d... rs6000: Fix PTImode handling in power8 swap optimization pa (*) cb51e0b... lto: Don't check obj.found for offload section (*) c429d50... libstdc++: Implement LWG 3746 for std::optional (*) 952e67c... libstdc++: Optimize __try_use_facet for const types (*) 8cf51d7... libstdc++: Fix std::allocator_traits::construct constraints (*) 43b8153... libstdc++: Only use std::time_put in std::format for non-C (*) 591b719... libstdc++: Define operator== for hash table iterators [PR11 (*) 125bab2... libstdc++: Fix std::random_shuffle for low RAND_MAX [PR8893 (*) de1923f... tree-optimization/116463 - complex lowering leaves around d (*) a35dd27... libstdc++: Make debug sequence members mutable [PR116369] (*) 9115593... libstdc++: Use noexcept insted of throw() in src/c++11/debu (*) 0bb2652... libstdc++: Simplify C++20 implementation of std::variant (*) b25b101... libstdc++: Make std::vector::reference constructor pr (*) f9f599a... Revert "Fortran: Fix class transformational intrinsic calls (*) 0798887... Match: Support form 4 for unsigned integer .SAT_TRUNC (*) b2c1d7c... libcpp: bump padding size in _cpp_convert_input [PR116458] (*) 96fe95b... optabs-query: Use opt_machine_mode for smallest_int_mode_fo (*) c22d57c... RISC-V: Expand vec abs without masking. (*) a8ae8f9... Fix test failure on powerpc targets (*) 19c22fb... 
ada: Fix crash on aliased variable with packed array type a (*) 87bdd17... ada: String interpolation: report error without Extensions (*) 509cc70... ada: Fix incorrect tracebacks on Windows (*) a7ff045... ada: Crash on string interpolation with custom string types (*) 7dd9c7d... ada: Implicit_Dereference aspect specification for subtype (*) 24c5396... ada: Eliminated-mode overflow check not eliminated (*) 40903c7... ada: Update libraries with the limited flag (*) dce0d46... ada: Emit a warning on inheritly limited types (*) 92a9b55... ada: First controlling parameter aspect (*) 0020cae... ada: Fix style in lines starting with assignment operator (*) ff356c0... ada: Cleanup validity of boolean operators (*) f67d108... ada: Simplify validity checks for scalar parameters (*) 8719b18... ada: Fix validity checks for named parameter associations (*) 4522f1f... ada: First controlling parameter aspect (*) aa95cd9... ada: Error missing when 'access is applied to an interface (*) 8a41af7... ada: First controlling parameter aspect (*) a071fcd... fortran: Minor fix to -ffrontend-optimize description (*) afa9080... doc: Specifically link to GPL v3.0 for GM2 (*) 0636de8... Remove unnecessary view_convert obsoleted by [PR86468]. (*) f6b10fe... testsuite: Fix vect-mod-var.c for division by 0 [PR116461] (*) 2cd783b... Daily bump. (*) da043f9... testsuite: Fix gcc.dg/torture/pr116420.c for targets defaul (*) c937773... [PR rtl-optimization/116420] Fix interesting block bitmap D (*) 8e0da56... libstdc++: Add some missing ranges feature-test macro tests (*) 792adb8... Recompute TYPE_MODE and DECL_MODE for aggregate type for ac (*) a025081... RISC-V: Fix vector cfi notes for stack-clash protection (*) 51761c5... libstdc++: Optimize std::projected (*) 6202324... libstdc++: Implement P2997R1 changes to the indirect invoca (*) b552730... libstdc++: Implement P2609R3 changes to the indirect invoca (*) a98dd53... Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook (*) b07f8a3... 
fold: Fix `a * 1j` if a has side effects [PR116454] (*) 4e905bd... fix single argument static_assert (*) 313aa73... PR target/116365: Add user-friendly arguments to --param aa (*) 76c2954... RISC-V: Enable -gvariable-location-views by default (*) bcb33b1... Do not emit a redundant DW_TAG_lexical_block for inlined su (*) 9bbad36... PR tree-optimization/101390: Vectorize modulo operator (*) 2349609... Dump aliases in -fcallgraph-info (*) c1aba5e... Makefile.tpl: fix whitespace in licence header (*) d6a112a... Makefile.tpl: drop leftover intermodule cruft (*) 6ea25c0... Align ix86_{move_max,store_max} with vectorizer. (*) f155534... Daily bump. (*) 91f2139... RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 3 (*) 1e99e1b... RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 2 (*) cdc9cd4... [PR r
[gcc r15-3081] Dump aliases in -fcallgraph-info
https://gcc.gnu.org/g:23496098bba769044ed352c0d7bdb317477c16ac

commit r15-3081-g23496098bba769044ed352c0d7bdb317477c16ac
Author: Alexandre Oliva
Date:   Thu Aug 22 01:27:55 2024 -0300

    Dump aliases in -fcallgraph-info

    Dump ICF-unified decls, thunks, aliases and whatnot along with their
    ultimate targets, with edges from the alias to the target.

    Add support for dropping the source file's suffix when forming from
    dump-base, so that auxiliary files can be scanned, such as the .ci
    files generated by -fcallgraph-info, as in the testcase.

    for gcc/ChangeLog

            * toplev.cc (dump_final_alias_vcg): New.
            (dump_final_node_vcg): Dump aliases along with node.

    for gcc/testsuite/ChangeLog

            * lib/scandump.exp (dump-base): Support {} in dump base suffix
            to drop it.
            * gcc.dg/callgraph-info-1.c: New.

Diff:
---
 gcc/testsuite/gcc.dg/callgraph-info-1.c |  7 +++
 gcc/testsuite/lib/scandump.exp          |  4
 gcc/toplev.cc                           | 37 +
 3 files changed, 48 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/callgraph-info-1.c b/gcc/testsuite/gcc.dg/callgraph-info-1.c
new file mode 100644
index ..853ff9554eeb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/callgraph-info-1.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-fcallgraph-info" } */
+
+void f() {}
+void g() __attribute__ ((__alias__ ("f")));
+
+/* { dg-final { scan-dump-times "ci" "triangle" 1 "ci" {{}} } } */
diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
index 14536ae7379b..adf9886b61c9 100644
--- a/gcc/testsuite/lib/scandump.exp
+++ b/gcc/testsuite/lib/scandump.exp
@@ -37,6 +37,10 @@ proc dump-base { args } {
     # gcc-defs to base compilation dumps only on the source basename.
     set dumpbase $src
     if { [string length $dumpbase_suf] != 0 } {
+        # Accept {} as dump base suffix to drop the source suffix entirely.
+        if { "$dumpbase_suf" == "{}" } {
+            set dumpbase_suf ""
+        }
         regsub {[.][^.]*$} $src $dumpbase_suf dumpbase
     }
     return $dumpbase
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index eee4805b504a..f308fb151083 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -914,6 +914,37 @@ dump_final_callee_vcg (FILE *f, location_t location, tree callee)
   fputs ("\" }\n", f);
 }
 
+/* Callback for cgraph_node::call_for_symbol_thunks_and_aliases to dump to F_ a
+   node and an edge from ALIAS->DECL to CURRENT_FUNCTION_DECL.  */
+
+static bool
+dump_final_alias_vcg (cgraph_node *alias, void *f_)
+{
+  FILE *f = (FILE *)f_;
+
+  if (alias->decl == current_function_decl)
+    return false;
+
+  dump_final_node_vcg_start (f, alias->decl);
+  fputs ("\" shape : triangle }\n", f);
+
+  fputs ("edge: { sourcename: \"", f);
+  print_decl_identifier (f, alias->decl, PRINT_DECL_UNIQUE_NAME);
+  fputs ("\" targetname: \"", f);
+  print_decl_identifier (f, current_function_decl, PRINT_DECL_UNIQUE_NAME);
+  location_t location = DECL_SOURCE_LOCATION (alias->decl);
+  if (LOCATION_LOCUS (location) != UNKNOWN_LOCATION)
+    {
+      expanded_location loc;
+      fputs ("\" label: \"", f);
+      loc = expand_location (location);
+      fprintf (f, "%s:%d:%d", loc.file, loc.line, loc.column);
+    }
+  fputs ("\" }\n", f);
+
+  return false;
+}
+
 /* Dump final cgraph node in VCG format.  */
 
 static void
@@ -950,6 +981,12 @@ dump_final_node_vcg (FILE *f)
     dump_final_callee_vcg (f, c->location, c->decl);
   vec_free (cfun->su->callees);
   cfun->su->callees = NULL;
+
+  cgraph_node *node = cgraph_node::get (current_function_decl);
+  if (!node)
+    return;
+  node->call_for_symbol_thunks_and_aliases (dump_final_alias_vcg, f,
+                                            true, false);
 }
 
 /* Output stack usage and callgraph info, as requested.  */
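The shape of the records that dump_final_alias_vcg writes can be approximated outside the compiler. The sketch below is a standalone model under stated assumptions: struct vcg_symbol and format_alias_vcg are hypothetical names, plain strings replace GCC's decl printing, and the node's title and label fields are a guess at what dump_final_node_vcg_start emits, since that function is not part of the diff. Only the "shape : triangle" suffix and the edge layout follow the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a declaration: a unique name plus an
   optional source location (FILE is NULL when it is unknown).  */
struct vcg_symbol
{
  const char *name;
  const char *file;
  int line;
  int column;
};

/* Format a triangle-shaped node for ALIAS and an edge from ALIAS to
   TARGET into BUF.  Returns the snprintf result.  The node's title
   and label are assumptions; the edge follows the patch.  */
int
format_alias_vcg (char *buf, size_t size,
                  const struct vcg_symbol *alias,
                  const struct vcg_symbol *target)
{
  if (alias->file != NULL)
    return snprintf (buf, size,
                     "node: { title: \"%s\" label: \"%s\" shape : triangle }\n"
                     "edge: { sourcename: \"%s\" targetname: \"%s\""
                     " label: \"%s:%d:%d\" }\n",
                     alias->name, alias->name, alias->name, target->name,
                     alias->file, alias->line, alias->column);

  /* Unknown location: the edge carries no label, as in the patch.  */
  return snprintf (buf, size,
                   "node: { title: \"%s\" label: \"%s\" shape : triangle }\n"
                   "edge: { sourcename: \"%s\" targetname: \"%s\" }\n",
                   alias->name, alias->name, alias->name, target->name);
}
```

For the testcase's alias g of f, this model yields one node containing the word triangle, which is the property the scan-dump-times directive in callgraph-info-1.c checks for in the real .ci output.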
[gcc(refs/users/aoliva/heads/testme)] Dump aliases in -fcallgraph-info
https://gcc.gnu.org/g:d2b89c77861c4a773efada3954e910b6623f8eb5 commit d2b89c77861c4a773efada3954e910b6623f8eb5 Author: Alexandre Oliva Date: Thu Aug 15 02:00:18 2024 -0300 Dump aliases in -fcallgraph-info Dump ICF-unified decls, thunks, aliases and whatnot along with their ultimate targets, with edges from the alias to the target. Add support for dropping the source file's suffix when forming from dump-base, so that auxiliary files can be scanned, such as the .ci files generated by -fcallgraph-info, as in the testcase. for gcc/ChangeLog * toplev.cc (dump_final_alias_vcg): New. (dump_final_node_vcg): Dump aliases along with node. for gcc/testsuite/ChangeLog * lib/scandump.exp (dump-base): Support {} in dump base suffix to drop it. * gcc.dg/callgraph-info-1.c: New. Diff: --- gcc/testsuite/gcc.dg/callgraph-info-1.c | 7 +++ gcc/testsuite/lib/scandump.exp | 4 gcc/toplev.cc | 37 + 3 files changed, 48 insertions(+) diff --git a/gcc/testsuite/gcc.dg/callgraph-info-1.c b/gcc/testsuite/gcc.dg/callgraph-info-1.c new file mode 100644 index 000..853ff9554ee --- /dev/null +++ b/gcc/testsuite/gcc.dg/callgraph-info-1.c @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-fcallgraph-info" } */ + +void f() {} +void g() __attribute__ ((__alias__ ("f"))); + +/* { dg-final { scan-dump-times "ci" "triangle" 1 "ci" {{}} } } */ diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp index 14536ae7379..adf9886b61c 100644 --- a/gcc/testsuite/lib/scandump.exp +++ b/gcc/testsuite/lib/scandump.exp @@ -37,6 +37,10 @@ proc dump-base { args } { # gcc-defs to base compilation dumps only on the source basename. set dumpbase $src if { [string length $dumpbase_suf] != 0 } { + # Accept {} as dump base suffix to drop the source suffix entirely. 
+ if { "$dumpbase_suf" == "{}" } { + set dumpbase_suf "" + } regsub {[.][^.]*$} $src $dumpbase_suf dumpbase } return $dumpbase diff --git a/gcc/toplev.cc b/gcc/toplev.cc index eee4805b504..f308fb15108 100644 --- a/gcc/toplev.cc +++ b/gcc/toplev.cc @@ -914,6 +914,37 @@ dump_final_callee_vcg (FILE *f, location_t location, tree callee) fputs ("\" }\n", f); } +/* Callback for cgraph_node::call_for_symbol_thunks_and_aliases to dump to F_ a + node and an edge from ALIAS->DECL to CURRENT_FUNCTION_DECL. */ + +static bool +dump_final_alias_vcg (cgraph_node *alias, void *f_) +{ + FILE *f = (FILE *)f_; + + if (alias->decl == current_function_decl) +return false; + + dump_final_node_vcg_start (f, alias->decl); + fputs ("\" shape : triangle }\n", f); + + fputs ("edge: { sourcename: \"", f); + print_decl_identifier (f, alias->decl, PRINT_DECL_UNIQUE_NAME); + fputs ("\" targetname: \"", f); + print_decl_identifier (f, current_function_decl, PRINT_DECL_UNIQUE_NAME); + location_t location = DECL_SOURCE_LOCATION (alias->decl); + if (LOCATION_LOCUS (location) != UNKNOWN_LOCATION) +{ + expanded_location loc; + fputs ("\" label: \"", f); + loc = expand_location (location); + fprintf (f, "%s:%d:%d", loc.file, loc.line, loc.column); +} + fputs ("\" }\n", f); + + return false; +} + /* Dump final cgraph node in VCG format. */ static void @@ -950,6 +981,12 @@ dump_final_node_vcg (FILE *f) dump_final_callee_vcg (f, c->location, c->decl); vec_free (cfun->su->callees); cfun->su->callees = NULL; + + cgraph_node *node = cgraph_node::get (current_function_decl); + if (!node) +return; + node->call_for_symbol_thunks_and_aliases (dump_final_alias_vcg, f, + true, false); } /* Output stack usage and callgraph info, as requested. */
[gcc(refs/users/aoliva/heads/testme)] Optimize initialization of small padded objects
https://gcc.gnu.org/g:7b50738c0cce248ecb98c8e2bf5f8115c4a90e74 commit 7b50738c0cce248ecb98c8e2bf5f8115c4a90e74 Author: Alexandre Oliva Date: Wed Aug 14 21:59:28 2024 -0300 Optimize initialization of small padded objects When small objects containing padding bits (or bytes) are fully initialized, we will often store them in registers, and setting bitfields and other small fields will attempt to preserve the uninitialized padding bits, which tends to be expensive. Zero-initializing registers, OTOH, tends to be cheap. So, if we're optimizing, zero-initialize such small padded objects even if that's not needed for correctness. We can't zero-initialize all such padding objects, though: if there's no padding whatsoever, and all fields are initialized with nonzero, the zero initialization would be flagged as dead. That's why we introduce machinery to detect whether objects have padding bits. I considered distinguishing between bitfields, units and larger padding elements, but I didn't pursue that distinction. Since the object's zero-initialization subsumes fields' zero-initialization, the empty string test in builtin-snprintf-6.c's test_assign_aggregate would regress without the addition of native_encode_constructor. for gcc/ChangeLog * expr.cc (categorize_ctor_elements_1): Change p_complete to int, to distinguish complete initialization in presence or absence of uninitialized padding bits. (categorize_ctor_elements): Likewise. Adjust all callers... * expr.h (categorize_ctor_elements): ... and declaration. (type_has_padding_at_level_p): New. * gimple-fold.cc (type_has_padding_at_level_p): New. * fold-const.cc (native_encode_constructor): New. (native_encode_expr): Call it. * gimplify.cc (gimplify_init_constructor): Clear small non-addressable non-volatile objects with padding or other uninitialized fields as an optimization. for gcc/testsuite/ChangeLog * gcc.dg/init-pad-1.c: New. 
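To illustrate the kind of object the commit message above describes, here is a minimal sketch in C. The type `small_flags` and the functions `make_flags` and `flags_sum` are hypothetical names chosen for illustration; they are not taken from the patch or from gcc.dg/init-pad-1.c. Whether a given compiler actually emits a whole-object clear before the field stores depends on the GCC version, optimization level and target, so treat this as an assumption-laden example rather than a reproduction of the testcase.

```c
#include <assert.h>

/* Hypothetical small object: 8 value bits in a storage unit that is
   typically 32 bits wide, so the remaining bits are padding.  */
struct small_flags
{
  unsigned kind : 3;
  unsigned level : 5;
};

/* Every field is initialized, yet the padding bits are left unspecified.
   Per the commit message, an optimizing compiler may zero the whole
   object first rather than preserve those bits across bitfield stores.  */
static struct small_flags
make_flags (unsigned kind, unsigned level)
{
  struct small_flags f = { kind, level };
  return f;
}

/* Read both fields back.  Only the field values are observable here;
   unsigned bitfields keep the low 3 and low 5 bits respectively.  */
static unsigned
flags_sum (unsigned kind, unsigned level)
{
  struct small_flags f = make_flags (kind, level);
  assert (f.kind == (kind & 7u));
  assert (f.level == (level & 31u));
  return f.kind + f.level;
}
```

Compiling such a function with and without optimization and comparing the generated code is one way to observe whether the padding-preserving stores are replaced by a clear followed by plain field stores.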
Diff: --- gcc/expr.cc | 20 ++-- gcc/expr.h| 3 ++- gcc/fold-const.cc | 33 ++ gcc/gimple-fold.cc| 50 +++ gcc/gimplify.cc | 14 ++- gcc/testsuite/gcc.dg/init-pad-1.c | 18 ++ 6 files changed, 129 insertions(+), 9 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index 2089c2b86a9..a701c67b348 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -7096,7 +7096,7 @@ count_type_elements (const_tree type, bool for_ctor_p) static bool categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, HOST_WIDE_INT *p_unique_nz_elts, - HOST_WIDE_INT *p_init_elts, bool *p_complete) + HOST_WIDE_INT *p_init_elts, int *p_complete) { unsigned HOST_WIDE_INT idx; HOST_WIDE_INT nz_elts, unique_nz_elts, init_elts, num_fields; @@ -7218,7 +7218,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor), num_fields, elt_type)) -*p_complete = false; +*p_complete = 0; + else if (*p_complete > 0 + && type_has_padding_at_level_p (TREE_TYPE (ctor))) +*p_complete = -1; *p_nz_elts += nz_elts; *p_unique_nz_elts += unique_nz_elts; @@ -7239,7 +7242,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, and place it in *P_ELT_COUNT. * whether the constructor is complete -- in the sense that every meaningful byte is explicitly given a value -- - and place it in *P_COMPLETE. + and place it in *P_COMPLETE: + - 0 if any field is missing + - 1 if all fields are initialized, and there's no padding + - -1 if all fields are initialized, but there's padding Return whether or not CTOR is a valid static constant initializer, the same as "initializer_constant_valid_p (CTOR, TREE_TYPE (CTOR)) != 0". 
*/ @@ -7247,12 +7253,12 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, bool categorize_ctor_elements (const_tree ctor, HOST_WIDE_INT *p_nz_elts, HOST_WIDE_INT *p_unique_nz_elts, - HOST_WIDE_INT *p_init_elts, bool *p_complete) + HOST_WIDE_INT *p_init_elts, int *p_complete) { *p_nz_elts = 0; *p_unique_nz_elts = 0; *p_init_elts = 0; - *p_complete = true; + *p_complete = 1; return categorize_ctor_elements_1 (ctor, p_nz_elts, p_unique_nz_elts, p_init_elts, p_complete); @@ -7313,7 +7319,7 @@ mostly_zeros_p (const_tree exp) if (TREE_CODE (exp)
[gcc/aoliva/heads/testme] (2 commits) Dump aliases in -fcallgraph-info
The branch 'aoliva/heads/testme' was updated to point to:

 d2b89c77861... Dump aliases in -fcallgraph-info

It previously pointed to:

 17d9d479afd... Dump aliases in -fcallgraph-info

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  17d9d47... Dump aliases in -fcallgraph-info
  ebf9b1b... Optimize initialization of small padded objects

Summary of changes (added commits):
---

  d2b89c7... Dump aliases in -fcallgraph-info
  7b50738... Optimize initialization of small padded objects
[gcc(refs/users/aoliva/heads/testme)] Dump aliases in -fcallgraph-info
https://gcc.gnu.org/g:17d9d479afd4de2939c2d507691394ff32983296

commit 17d9d479afd4de2939c2d507691394ff32983296
Author: Alexandre Oliva
Date:   Thu Aug 15 02:00:18 2024 -0300

    Dump aliases in -fcallgraph-info

    Dump ICF-unified decls, thunks, aliases and whatnot along with their
    ultimate targets, with edges from the alias to the target.

    for gcc/ChangeLog

        * toplev.cc (dump_final_alias_vcg): New.
        (dump_final_node_vcg): Dump aliases along with node.

Diff:
---
 gcc/toplev.cc | 37 +
 1 file changed, 37 insertions(+)

diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index eee4805b504..f308fb15108 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -914,6 +914,37 @@ dump_final_callee_vcg (FILE *f, location_t location, tree callee)
   fputs ("\" }\n", f);
 }
 
+/* Callback for cgraph_node::call_for_symbol_thunks_and_aliases to dump to F_ a
+   node and an edge from ALIAS->DECL to CURRENT_FUNCTION_DECL.  */
+
+static bool
+dump_final_alias_vcg (cgraph_node *alias, void *f_)
+{
+  FILE *f = (FILE *)f_;
+
+  if (alias->decl == current_function_decl)
+    return false;
+
+  dump_final_node_vcg_start (f, alias->decl);
+  fputs ("\" shape : triangle }\n", f);
+
+  fputs ("edge: { sourcename: \"", f);
+  print_decl_identifier (f, alias->decl, PRINT_DECL_UNIQUE_NAME);
+  fputs ("\" targetname: \"", f);
+  print_decl_identifier (f, current_function_decl, PRINT_DECL_UNIQUE_NAME);
+  location_t location = DECL_SOURCE_LOCATION (alias->decl);
+  if (LOCATION_LOCUS (location) != UNKNOWN_LOCATION)
+    {
+      expanded_location loc;
+      fputs ("\" label: \"", f);
+      loc = expand_location (location);
+      fprintf (f, "%s:%d:%d", loc.file, loc.line, loc.column);
+    }
+  fputs ("\" }\n", f);
+
+  return false;
+}
+
 /* Dump final cgraph node in VCG format.  */
 
 static void
@@ -950,6 +981,12 @@ dump_final_node_vcg (FILE *f)
     dump_final_callee_vcg (f, c->location, c->decl);
   vec_free (cfun->su->callees);
   cfun->su->callees = NULL;
+
+  cgraph_node *node = cgraph_node::get (current_function_decl);
+  if (!node)
+    return;
+  node->call_for_symbol_thunks_and_aliases (dump_final_alias_vcg, f,
+                                            true, false);
 }
 
 /* Output stack usage and callgraph info, as requested.  */
[gcc(refs/users/aoliva/heads/testme)] Optimize initialization of small padded objects
https://gcc.gnu.org/g:ebf9b1becc8cf76421f1741ac8084d139abd49db commit ebf9b1becc8cf76421f1741ac8084d139abd49db Author: Alexandre Oliva Date: Wed Aug 14 21:59:28 2024 -0300 Optimize initialization of small padded objects When small objects containing padding bits (or bytes) are fully initialized, we will often store them in registers, and setting bitfields and other small fields will attempt to preserve the uninitialized padding bits, which tends to be expensive. Zero-initializing registers, OTOH, tends to be cheap. So, if we're optimizing, zero-initialize such small padded objects even if that's not needed for correctness. We can't zero-initialize all such padding objects, though: if there's no padding whatsoever, and all fields are initialized with nonzero, the zero initialization would be flagged as dead. That's why we introduce machinery to detect whether objects have padding bits. I considered distinguishing between bitfields, units and larger padding elements, but I didn't pursue that distinction. Since the object's zero-initialization subsumes fields' zero-initialization, the empty string test in builtin-snprintf-6.c's test_assign_aggregate would regress without the addition of native_encode_constructor. for gcc/ChangeLog * expr.cc (categorize_ctor_elements_1): Change p_complete to int, to distinguish complete initialization in presence or absence of uninitialized padding bits. (categorize_ctor_elements): Likewise. Adjust all callers... * expr.h (categorize_ctor_elements): ... and declaration. (type_has_padding_at_level_p): New. * gimple-fold.cc (type_has_padding_at_level_p): New. * fold-const.cc (native_encode_constructor): New. (native_encode_expr): Call it. * gimplify.cc (gimplify_init_constructor): Clear small non-addressable non-volatile objects with padding or other uninitialized fields as an optimization. 
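The ChangeLog entry above turns p_complete from a bool into an int so that complete initialization with and without padding can be told apart. The following is a small stand-alone sketch of that three-way encoding, not GCC code: the values 0, 1 and -1 follow the comment added by the patch, while the enumerator names and the helpers `classify_ctor`, `all_fields_initialized_p` and `no_padding_left_p` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Values mirror the encoding documented in the patch for *P_COMPLETE;
   the enumerator names are made up for this sketch.  */
enum
{
  CTOR_INCOMPLETE = 0,       /* some field is missing */
  CTOR_COMPLETE = 1,         /* all fields initialized, no padding */
  CTOR_COMPLETE_PADDED = -1  /* all fields initialized, padding present */
};

/* Hypothetical classifier taking plain counts instead of trees.  */
static int
classify_ctor (int fields_total, int fields_initialized, int has_padding)
{
  if (fields_initialized < fields_total)
    return CTOR_INCOMPLETE;
  return has_padding ? CTOR_COMPLETE_PADDED : CTOR_COMPLETE;
}

/* Nonzero for both "complete" states, so a test written for the old
   bool, such as "if (complete)", keeps its meaning.  */
static int
all_fields_initialized_p (int complete)
{
  return complete != 0;
}

/* Nonzero only when no uninitialized padding bits remain.  */
static int
no_padding_left_p (int complete)
{
  return complete > 0;
}

/* Quick consistency check of the encoding.  */
static void
check_encoding (void)
{
  int padded = classify_ctor (2, 2, 1);
  assert (all_fields_initialized_p (padded));
  assert (!no_padding_left_p (padded));
}
```

Using a negative value for the padded case means truthiness still answers "are all fields initialized?", while a `> 0` comparison singles out the case with no padding.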
Diff: --- gcc/expr.cc| 20 +--- gcc/expr.h | 3 ++- gcc/fold-const.cc | 33 + gcc/gimple-fold.cc | 50 ++ gcc/gimplify.cc| 14 +- 5 files changed, 111 insertions(+), 9 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index 2089c2b86a9..a701c67b348 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -7096,7 +7096,7 @@ count_type_elements (const_tree type, bool for_ctor_p) static bool categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, HOST_WIDE_INT *p_unique_nz_elts, - HOST_WIDE_INT *p_init_elts, bool *p_complete) + HOST_WIDE_INT *p_init_elts, int *p_complete) { unsigned HOST_WIDE_INT idx; HOST_WIDE_INT nz_elts, unique_nz_elts, init_elts, num_fields; @@ -7218,7 +7218,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor), num_fields, elt_type)) -*p_complete = false; +*p_complete = 0; + else if (*p_complete > 0 + && type_has_padding_at_level_p (TREE_TYPE (ctor))) +*p_complete = -1; *p_nz_elts += nz_elts; *p_unique_nz_elts += unique_nz_elts; @@ -7239,7 +7242,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, and place it in *P_ELT_COUNT. * whether the constructor is complete -- in the sense that every meaningful byte is explicitly given a value -- - and place it in *P_COMPLETE. + and place it in *P_COMPLETE: + - 0 if any field is missing + - 1 if all fields are initialized, and there's no padding + - -1 if all fields are initialized, but there's padding Return whether or not CTOR is a valid static constant initializer, the same as "initializer_constant_valid_p (CTOR, TREE_TYPE (CTOR)) != 0". 
*/ @@ -7247,12 +7253,12 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, bool categorize_ctor_elements (const_tree ctor, HOST_WIDE_INT *p_nz_elts, HOST_WIDE_INT *p_unique_nz_elts, - HOST_WIDE_INT *p_init_elts, bool *p_complete) + HOST_WIDE_INT *p_init_elts, int *p_complete) { *p_nz_elts = 0; *p_unique_nz_elts = 0; *p_init_elts = 0; - *p_complete = true; + *p_complete = 1; return categorize_ctor_elements_1 (ctor, p_nz_elts, p_unique_nz_elts, p_init_elts, p_complete); @@ -7313,7 +7319,7 @@ mostly_zeros_p (const_tree exp) if (TREE_CODE (exp) == CONSTRUCTOR) { HOST_WIDE_INT nz_elts, unz_elts, init_elts; - bool complete_p; + int complete_p; categorize_ctor_elements (exp, &nz_elts, &unz_elts, &init_
[gcc/aoliva/heads/testme] (2 commits) Dump aliases in -fcallgraph-info
The branch 'aoliva/heads/testme' was updated to point to:

 17d9d479afd... Dump aliases in -fcallgraph-info

It previously pointed to:

 8152f1f5491... optimize initialization of small padded objects

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  8152f1f... optimize initialization of small padded objects

Summary of changes (added commits):
---

  17d9d47... Dump aliases in -fcallgraph-info
  ebf9b1b... Optimize initialization of small padded objects
[gcc/aoliva/heads/testme] (710 commits) optimize initialization of small padded objects
The branch 'aoliva/heads/testme' was updated to point to: 8152f1f54917... optimize initialization of small padded objects It previously pointed to: 9d90ad447ba1... [libstdc++] [testsuite] avoid async.cc loss of precision [P Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 9d90ad4... [libstdc++] [testsuite] avoid async.cc loss of precision [P Summary of changes (added commits): --- 8152f1f... optimize initialization of small padded objects 4d2e8fc... Daily bump. (*) d91b6c9... c++: ICE with NSDMIs and fn arguments [PR116015] (*) a247088... s390: Remove vector intrinsics (*) e8a7142... s390: Fix high-level builtins vec_gfmsum{,_accum}_128 (*) a82c4df... Fortran: fix minor frontend GMP leaks (*) edb2712... i386: Optimization for APX NDD is always zero-uppered for s (*) d08a5f2... i386: Optimization for APX NDD is always zero-uppered for l (*) 1b76174... i386: Optimization for APX NDD is always zero-uppered for s (*) a302cd6... i386: Optimization for APX NDD is always zero-uppered for A (*) 42aba47... Restrict pr116202-run-1.c test to riscv_v target (*) 54be14b... Prevent future proc_ptr parsing issues in associate [PR1029 (*) bb23247... Fix ICE in build_function_decl [PR116292] (*) ca7936f... genoutput: Accelerate the place_operands function. (*) e4f9a87... Revert "[rtl-optimization/116244] Don't create bogus regs i (*) 10972e6... testsuite: Fix fam-in-union-alone-in-struct-2.c with unsign (*) c3c83d2... Move ix86_align_loops into a separate pass and insert the p (*) 9045ab7... Daily bump. (*) b13e346... testsuite: Fix struct size check [PR116155] (*) cc00a73... ifcvt: Fix force_operand ICE in noce_convert_multiple_sets (*) 9988d7e... Fortran: reject array constructor value of abstract type [P (*) ccd7068... RISC-V: Fix non-obvious comment typos (*) 5618b02... Internal-fn: Handle vector bool type for type strict match (*) 49d5e21... LRA: Don't emit move for substituted CONSTATNT_P operand [P (*) bee532c... 
Regenerate avr.opt.urls (*) 3f1e15e... Daily bump. (*) 0451bc5... rs6000: ROP - Do not disable shrink-wrapping for leaf funct (*) ef90a13... RISC-V: Fix missing abi arg in test (*) e9738e7... [rtl-optimization/116244] Don't create bogus regs in alter_ (*) edc47d3... borrowck: Fix debug prints on 32-bits architectures (*) 12028d7... borrowck: Avoid overloading issues on 32bit architectures (*) b219cbe... ifcvt: Handle multiple rewired regs and refactor noce_conve (*) 72c9b5f... ifcvt: Allow more operations in multiple set if conversion (*) 28b3812... ifcvt: handle sequences that clobber flags in noce_convert_ (*) 68da681... AVR: target/85624 - Fix non-matching alignment in clrmem* i (*) 24df2ab... 16-bit testsuite fixes - excessive code size (*) 46bd63d... This fixes problems with tests that exceed a data type or t (*) 40b9a7b... Avoid cfg corruption when using sjlj exceptions where loops (*) 9ab8681... Use splay-tree-utils.h in tree-ssa-sccvn [PR30920] (*) fcc766c... aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for Advan (*) 8d8db21... Fortran: Fix coarray in associate not linking [PR85510] (*) 4bcb480... Initial support for AVX10.2 (*) 7a970bd... PR target/116275: Handle STV of *extenddi2_doubleword_highp (*) 7bf4cd4... LoongArch: Provide ashr lshr and ashl RTL pattern for vecto (*) 0498f8b... LoongArch: Drop vcond{,u} expanders. (*) 75e852b... LoongArch: Use iorn and andn standard pattern names. (*) 9f3b5c2... PR modula2/116181 fix ODR warnings for C/m2 interface libra (*) f09be22... Daily bump. (*) 2b23a44... Fortran: silence Wmaybe-uninitialized warnings for LTO buil (*) 149a23e... AVR: -mlra is not documeted in TEXI. (*) 29a3236... AVR: Add function avr.cc::ra_in_progress(). (*) 19c9ba0... Daily bump. (*) 8035619... i386: testsuite: Adapt fentryname3.c for r14-811 change [PR (*) 331f7d8... i386: testsuite: Add -no-pie for pr113689-1.c [PR70150] (*) 85a6073... Fix reference to the dom walker function in the documentati (*) 16ce781... 
gm2: add missing debug output guard (*) 9d5c500... testsuite: Fix up sse3-addsubps.c (*) 09a87ea... AVR: ad target/113934 - Add option -mlra to enable LRA. (*) 8cc67b5... c++: inherited CTAD fixes [PR116276] (*) 70da0ca... c++: DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P tweaks (*) cf7feae... c++: clean up cp_identifier_kind checks (*) 6b4b27a... Daily bump. (*) d4e1290... [RISC-V][PR target/116283] Fix split code for recent Zbs im (*) 4734c1b... Revert "lra: emit caller-save register spills before call i (*) 9e4da94... Adjust rangers recomputation depth based on the number of B (*) 5ce3874... Limit equivalency processing in rangers cache. (*) d0bc1cb... btf: Protect BTF_KIND_INFO against invalid kind (*) 786ebbd... c++: Don't accept multiple enum definitions within template (*) 180ede3... RISC-V: Enab
[gcc(refs/users/aoliva/heads/testme)] optimize initialization of small padded objects
https://gcc.gnu.org/g:8152f1f549179b377634b7ec360e6907fdd528c1 commit 8152f1f549179b377634b7ec360e6907fdd528c1 Author: Alexandre Oliva Date: Wed Aug 14 21:59:28 2024 -0300 optimize initialization of small padded objects Diff: --- gcc/expr.cc| 20 +--- gcc/expr.h | 3 ++- gcc/fold-const.cc | 33 + gcc/gimple-fold.cc | 50 ++ gcc/gimplify.cc| 14 +- 5 files changed, 111 insertions(+), 9 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index 2089c2b86a98..a701c67b3485 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -7096,7 +7096,7 @@ count_type_elements (const_tree type, bool for_ctor_p) static bool categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, HOST_WIDE_INT *p_unique_nz_elts, - HOST_WIDE_INT *p_init_elts, bool *p_complete) + HOST_WIDE_INT *p_init_elts, int *p_complete) { unsigned HOST_WIDE_INT idx; HOST_WIDE_INT nz_elts, unique_nz_elts, init_elts, num_fields; @@ -7218,7 +7218,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor), num_fields, elt_type)) -*p_complete = false; +*p_complete = 0; + else if (*p_complete > 0 + && type_has_padding_at_level_p (TREE_TYPE (ctor))) +*p_complete = -1; *p_nz_elts += nz_elts; *p_unique_nz_elts += unique_nz_elts; @@ -7239,7 +7242,10 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, and place it in *P_ELT_COUNT. * whether the constructor is complete -- in the sense that every meaningful byte is explicitly given a value -- - and place it in *P_COMPLETE. + and place it in *P_COMPLETE: + - 0 if any field is missing + - 1 if all fields are initialized, and there's no padding + - -1 if all fields are initialized, but there's padding Return whether or not CTOR is a valid static constant initializer, the same as "initializer_constant_valid_p (CTOR, TREE_TYPE (CTOR)) != 0". 
*/ @@ -7247,12 +7253,12 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts, bool categorize_ctor_elements (const_tree ctor, HOST_WIDE_INT *p_nz_elts, HOST_WIDE_INT *p_unique_nz_elts, - HOST_WIDE_INT *p_init_elts, bool *p_complete) + HOST_WIDE_INT *p_init_elts, int *p_complete) { *p_nz_elts = 0; *p_unique_nz_elts = 0; *p_init_elts = 0; - *p_complete = true; + *p_complete = 1; return categorize_ctor_elements_1 (ctor, p_nz_elts, p_unique_nz_elts, p_init_elts, p_complete); @@ -7313,7 +7319,7 @@ mostly_zeros_p (const_tree exp) if (TREE_CODE (exp) == CONSTRUCTOR) { HOST_WIDE_INT nz_elts, unz_elts, init_elts; - bool complete_p; + int complete_p; categorize_ctor_elements (exp, &nz_elts, &unz_elts, &init_elts, &complete_p); @@ -7331,7 +7337,7 @@ all_zeros_p (const_tree exp) if (TREE_CODE (exp) == CONSTRUCTOR) { HOST_WIDE_INT nz_elts, unz_elts, init_elts; - bool complete_p; + int complete_p; categorize_ctor_elements (exp, &nz_elts, &unz_elts, &init_elts, &complete_p); diff --git a/gcc/expr.h b/gcc/expr.h index 533ae0af3871..04782b15f192 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -361,7 +361,8 @@ extern unsigned HOST_WIDE_INT highest_pow2_factor (const_tree); extern bool categorize_ctor_elements (const_tree, HOST_WIDE_INT *, HOST_WIDE_INT *, HOST_WIDE_INT *, - bool *); + int *); +extern bool type_has_padding_at_level_p (tree); extern bool immediate_const_ctor_p (const_tree, unsigned int words = 1); extern void store_constructor (tree, rtx, int, poly_int64, bool); extern HOST_WIDE_INT int_expr_size (const_tree exp); diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 8908e7381e72..5e7fd6460c5d 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -8193,6 +8193,36 @@ native_encode_string (const_tree expr, unsigned char *ptr, int len, int off) return len; } +/* subroutine of native_encode_expr. Encode the CONSTRUCTOR + specified by EXPR into the buffer PTR of length LEN bytes. 
+ Return the number of bytes placed in the buffer, or zero + upon failure. */ + +static int +native_encode_constructor (const_tree expr, unsigned char *ptr, int len, int off) +{ + /* We are only concerned with zero-initialization constructors here. */ + if (CONSTRUCTOR_NELTS (expr)) +return 0; + + /* Wide-char strings are encoded in target byte-order so native + encoding them is trivial. */ + if (BITS_PER_UNIT != CHAR_BIT +
[gcc/aoliva/heads/testbase] (709 commits) Daily bump.
The branch 'aoliva/heads/testbase' was updated to point to: 4d2e8fcdaf32... Daily bump. It previously pointed to: ad642d2c9506... [5/n][PR rtl-optimization/115877] Fix handling of input/out Diff: Summary of changes (added commits): --- 4d2e8fc... Daily bump. (*) d91b6c9... c++: ICE with NSDMIs and fn arguments [PR116015] (*) a247088... s390: Remove vector intrinsics (*) e8a7142... s390: Fix high-level builtins vec_gfmsum{,_accum}_128 (*) a82c4df... Fortran: fix minor frontend GMP leaks (*) edb2712... i386: Optimization for APX NDD is always zero-uppered for s (*) d08a5f2... i386: Optimization for APX NDD is always zero-uppered for l (*) 1b76174... i386: Optimization for APX NDD is always zero-uppered for s (*) a302cd6... i386: Optimization for APX NDD is always zero-uppered for A (*) 42aba47... Restrict pr116202-run-1.c test to riscv_v target (*) 54be14b... Prevent future proc_ptr parsing issues in associate [PR1029 (*) bb23247... Fix ICE in build_function_decl [PR116292] (*) ca7936f... genoutput: Accelerate the place_operands function. (*) e4f9a87... Revert "[rtl-optimization/116244] Don't create bogus regs i (*) 10972e6... testsuite: Fix fam-in-union-alone-in-struct-2.c with unsign (*) c3c83d2... Move ix86_align_loops into a separate pass and insert the p (*) 9045ab7... Daily bump. (*) b13e346... testsuite: Fix struct size check [PR116155] (*) cc00a73... ifcvt: Fix force_operand ICE in noce_convert_multiple_sets (*) 9988d7e... Fortran: reject array constructor value of abstract type [P (*) ccd7068... RISC-V: Fix non-obvious comment typos (*) 5618b02... Internal-fn: Handle vector bool type for type strict match (*) 49d5e21... LRA: Don't emit move for substituted CONSTATNT_P operand [P (*) bee532c... Regenerate avr.opt.urls (*) 3f1e15e... Daily bump. (*) 0451bc5... rs6000: ROP - Do not disable shrink-wrapping for leaf funct (*) ef90a13... RISC-V: Fix missing abi arg in test (*) e9738e7... [rtl-optimization/116244] Don't create bogus regs in alter_ (*) edc47d3... 
borrowck: Fix debug prints on 32-bits architectures (*) 12028d7... borrowck: Avoid overloading issues on 32bit architectures (*) b219cbe... ifcvt: Handle multiple rewired regs and refactor noce_conve (*) 72c9b5f... ifcvt: Allow more operations in multiple set if conversion (*) 28b3812... ifcvt: handle sequences that clobber flags in noce_convert_ (*) 68da681... AVR: target/85624 - Fix non-matching alignment in clrmem* i (*) 24df2ab... 16-bit testsuite fixes - excessive code size (*) 46bd63d... This fixes problems with tests that exceed a data type or t (*) 40b9a7b... Avoid cfg corruption when using sjlj exceptions where loops (*) 9ab8681... Use splay-tree-utils.h in tree-ssa-sccvn [PR30920] (*) fcc766c... aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for Advan (*) 8d8db21... Fortran: Fix coarray in associate not linking [PR85510] (*) 4bcb480... Initial support for AVX10.2 (*) 7a970bd... PR target/116275: Handle STV of *extenddi2_doubleword_highp (*) 7bf4cd4... LoongArch: Provide ashr lshr and ashl RTL pattern for vecto (*) 0498f8b... LoongArch: Drop vcond{,u} expanders. (*) 75e852b... LoongArch: Use iorn and andn standard pattern names. (*) 9f3b5c2... PR modula2/116181 fix ODR warnings for C/m2 interface libra (*) f09be22... Daily bump. (*) 2b23a44... Fortran: silence Wmaybe-uninitialized warnings for LTO buil (*) 149a23e... AVR: -mlra is not documeted in TEXI. (*) 29a3236... AVR: Add function avr.cc::ra_in_progress(). (*) 19c9ba0... Daily bump. (*) 8035619... i386: testsuite: Adapt fentryname3.c for r14-811 change [PR (*) 331f7d8... i386: testsuite: Add -no-pie for pr113689-1.c [PR70150] (*) 85a6073... Fix reference to the dom walker function in the documentati (*) 16ce781... gm2: add missing debug output guard (*) 9d5c500... testsuite: Fix up sse3-addsubps.c (*) 09a87ea... AVR: ad target/113934 - Add option -mlra to enable LRA. (*) 8cc67b5... c++: inherited CTAD fixes [PR116276] (*) 70da0ca... c++: DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P tweaks (*) cf7feae... 
c++: clean up cp_identifier_kind checks (*) 6b4b27a... Daily bump. (*) d4e1290... [RISC-V][PR target/116283] Fix split code for recent Zbs im (*) 4734c1b... Revert "lra: emit caller-save register spills before call i (*) 9e4da94... Adjust rangers recomputation depth based on the number of B (*) 5ce3874... Limit equivalency processing in rangers cache. (*) d0bc1cb... btf: Protect BTF_KIND_INFO against invalid kind (*) 786ebbd... c++: Don't accept multiple enum definitions within template (*) 180ede3... RISC-V: Enable stack clash in alloca (*) 2862d99... RISC-V: Add support to vector stack-clash protection (*) b82d173... RISC-V: Stack-clash protection implemention (*) 5694fcf... RISC-V: Move riscv_v_adjust_scalable_frame (*) 0e604d0... RISC-V: Small stack tie changes (*) f91f720... c-family: regenerate c.op
[gcc/aoliva/heads/testme] (171 commits) [libstdc++] [testsuite] avoid async.cc loss of precision [P
The branch 'aoliva/heads/testme' was updated to point to: 9d90ad447ba1... [libstdc++] [testsuite] avoid async.cc loss of precision [P It previously pointed to: 110c93a4411d... [strub] adjust all at-calls type variants at once Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 110c93a... [strub] adjust all at-calls type variants at once Summary of changes (added commits): --- 9d90ad4... [libstdc++] [testsuite] avoid async.cc loss of precision [P ad642d2... [5/n][PR rtl-optimization/115877] Fix handling of input/out (*) ad65caa... [powerpc] [testsuite] reorder dg directives [PR106069] (*) 7c5a9bf... c++/coroutines: correct passing *this to promise type [PR10 (*) 5d2115b... RISC-V: Implement the .SAT_TRUNC for scalar (*) d1b2554... Daily bump. (*) efcbe7b... Fix handling of ICF_NOVOPS in ipa-modref (*) 6f81b7f... c++: Some cp-tree.def comment fixes (*) 1407477... Fix modref's iteraction with store merging (*) 05f0e9e... Add -mcpu=power11 support. (*) ab7c0ae... [4/n][PR rtl-optimization/115877] Correct SUBREG handling i (*) cf8ffc5... Fix modref_eaf_analysis::analyze_ssa_name handling of value (*) 391f46f... Fix accounting of offsets in unadjusted_ptr_and_unit_offset (*) 0d19fbc... Compare loop bounds in ipa-icf (*) 34f33ea... rtl-ssa: Avoid using a stale splay tree root [PR116009] (*) e62988b... rtl-ssa: Add debug routines for def_splay_tree (*) ebde0cc... aarch64: Tighten aarch64_simd_mem_operand_p [PR115969] (*) 88d1619... [NFC][PR rtl-optimization/115877] Avoid setting irrelevant (*) a8e61cd... Fix hash of WIDEN_*_EXPR (*) 1e32a8b... constify inchash (*) 9d6... Fix Rejects allocatable coarray passed as a dummy argument (*) 0c5c0c9... AArch64: implement TARGET_VECTORIZE_CONDITIONAL_OPERATION_I (*) af792f0... middle-end: Implement conditonal store vectorizer pattern [ (*) 913bab2... testsuite: powerpc: fix dg-do run typo (*) 4ab19e4... RISC-V: Rearrange the test helper files for vector .SAT_* (*) 3260665... Daily bump. (*) 838999b... 
Fortran: Fix regression caused by r14-10477 [PR59104] (*)
  9d8ef27... [PR rtl-optimization/115877][2/n] Improve liveness computat (*)
  91e468b... [PR rtl-optimization/115877] Fix livein computation for ext (*)
  80c3733... gcc: stop adding -fno-common for checking builds (*)
  58b78cf... SH: Fix outage caused by recently added 2nd combine pass af (*)
  6d811c1... Daily bump. (*)
  1824caa... Require bitint575 for pr116003.c (*)
  4a46ba2... Revert "Add documentation for musttail attribute" (*)
  8805ad2... Revert "Add tests for C/C++ musttail attributes" (*)
  53660b1... Revert "C: Implement musttail attribute for returns" (*)
  ff6994e... Revert "C++: Support clang compatible [[musttail]] (PR83324 (*)
  493c555... Output CodeView function information (*)
  7357ba2... Add bitint to options for testcase (*)
  8e3fef3... doc: Remove documentation of two obsolete spec strings (*)
  e0d997e... Avoid undefined behaviour in build_option_suggestions (*)
  56f824c... Add documentation for musttail attribute (*)
  37c4703... Add tests for C/C++ musttail attributes (*)
  7db47f7... C: Implement musttail attribute for returns (*)
  59dd1d7... C++: Support clang compatible [[musttail]] (PR83324) (*)
  5c4c1fe... Add a musttail generic attribute to the c-attribs table (*)
  390c3e4... LoongArch: Organize the code related to split move and merg (*)
  8d6498f... Daily bump. (*)
  01c095a... Check for SSA_NAME not in the IL yet. (*)
  a95c191... libgomp: Document 'GOMP_teams4' (*)
  f911994... GCN: Honor OpenMP 5.1 'num_teams' lower bound (*)
  3850048... Rewrite usage comment at the top of 'gcc/passes.def' (*)
  348d890... Treat boolean vector elements as 0/-1 [PR115406] (*)
  ebdad26... arm: Update fp16-aapcs-[24].c after insn_propagation patch (*)
  2ee70c9... c++: xobj fn call without obj [PR115783] (*)
  9116490... AVR: Support new built-in function __builtin_avr_mask1. (*)
  8d6994f... libgomp: Remove bogus warnings from privatized-ref-2.f90. (*)
  c93be16... Fortran: character array constructor with >= 4 constant ele (*)
  b2f47a5... rs6000: Catch unsupported ABI errors when using -mrop-prote (*)
  58a9f3d... c++: add fixed testcase [PR109464] (*)
  8fbc386... bpf: create modifier for mem operand for xchg and cmpxchg (*)
  cea6473... c++: Add [dcl.init.aggr] examples to testsuite (*)
  a589d3b... Close GCC 11 branch (*)
  0f8261e... c++: Hash placeholder constraint in ctp_hasher (*)
  02cc849... Match: Only allow single use of MIN_EXPR for SAT_TRUNC form (*)
  e20ea6b... Daily bump. (*)
  9846b09... libatomic: Handle AVX+CX16 ZHAOXIN like Intel for 16b atomi (*)
  9690fb3... c++: implement DR1363 and DR1496 for __is_trivial [PR85723] (*)
  248e853... libbacktrace: use __has_attribute for fallthrough (*)
  6962835... rs6000: Fix .machine cpu selection w/ altivec [PR97367] (*)
  c192376... rs60
[gcc/aoliva/heads/testbase] (170 commits) [5/n][PR rtl-optimization/115877] Fix handling of input/out
The branch 'aoliva/heads/testbase' was updated to point to:

 ad642d2c9506... [5/n][PR rtl-optimization/115877] Fix handling of input/out

It previously pointed to:

 bf8e80f9d164... [i386] adjust flag_omit_frame_pointer in a single function

Diff:

Summary of changes (added commits):
---

  ad642d2... [5/n][PR rtl-optimization/115877] Fix handling of input/out (*)
  ad65caa... [powerpc] [testsuite] reorder dg directives [PR106069] (*)
  7c5a9bf... c++/coroutines: correct passing *this to promise type [PR10 (*)
  5d2115b... RISC-V: Implement the .SAT_TRUNC for scalar (*)
  d1b2554... Daily bump. (*)
  efcbe7b... Fix handling of ICF_NOVOPS in ipa-modref (*)
  6f81b7f... c++: Some cp-tree.def comment fixes (*)
  1407477... Fix modref's iteraction with store merging (*)
  05f0e9e... Add -mcpu=power11 support. (*)
  ab7c0ae... [4/n][PR rtl-optimization/115877] Correct SUBREG handling i (*)
  cf8ffc5... Fix modref_eaf_analysis::analyze_ssa_name handling of value (*)
  391f46f... Fix accounting of offsets in unadjusted_ptr_and_unit_offset (*)
  0d19fbc... Compare loop bounds in ipa-icf (*)
  34f33ea... rtl-ssa: Avoid using a stale splay tree root [PR116009] (*)
  e62988b... rtl-ssa: Add debug routines for def_splay_tree (*)
  ebde0cc... aarch64: Tighten aarch64_simd_mem_operand_p [PR115969] (*)
  88d1619... [NFC][PR rtl-optimization/115877] Avoid setting irrelevant (*)
  a8e61cd... Fix hash of WIDEN_*_EXPR (*)
  1e32a8b... constify inchash (*)
  9d6... Fix Rejects allocatable coarray passed as a dummy argument (*)
  0c5c0c9... AArch64: implement TARGET_VECTORIZE_CONDITIONAL_OPERATION_I (*)
  af792f0... middle-end: Implement conditonal store vectorizer pattern [ (*)
  913bab2... testsuite: powerpc: fix dg-do run typo (*)
  4ab19e4... RISC-V: Rearrange the test helper files for vector .SAT_* (*)
  3260665... Daily bump. (*)
  838999b... Fortran: Fix regression caused by r14-10477 [PR59104] (*)
  9d8ef27... [PR rtl-optimization/115877][2/n] Improve liveness computat (*)
  91e468b... [PR rtl-optimization/115877] Fix livein computation for ext (*)
  80c3733... gcc: stop adding -fno-common for checking builds (*)
  58b78cf... SH: Fix outage caused by recently added 2nd combine pass af (*)
  6d811c1... Daily bump. (*)
  1824caa... Require bitint575 for pr116003.c (*)
  4a46ba2... Revert "Add documentation for musttail attribute" (*)
  8805ad2... Revert "Add tests for C/C++ musttail attributes" (*)
  53660b1... Revert "C: Implement musttail attribute for returns" (*)
  ff6994e... Revert "C++: Support clang compatible [[musttail]] (PR83324 (*)
  493c555... Output CodeView function information (*)
  7357ba2... Add bitint to options for testcase (*)
  8e3fef3... doc: Remove documentation of two obsolete spec strings (*)
  e0d997e... Avoid undefined behaviour in build_option_suggestions (*)
  56f824c... Add documentation for musttail attribute (*)
  37c4703... Add tests for C/C++ musttail attributes (*)
  7db47f7... C: Implement musttail attribute for returns (*)
  59dd1d7... C++: Support clang compatible [[musttail]] (PR83324) (*)
  5c4c1fe... Add a musttail generic attribute to the c-attribs table (*)
  390c3e4... LoongArch: Organize the code related to split move and merg (*)
  8d6498f... Daily bump. (*)
  01c095a... Check for SSA_NAME not in the IL yet. (*)
  a95c191... libgomp: Document 'GOMP_teams4' (*)
  f911994... GCN: Honor OpenMP 5.1 'num_teams' lower bound (*)
  3850048... Rewrite usage comment at the top of 'gcc/passes.def' (*)
  348d890... Treat boolean vector elements as 0/-1 [PR115406] (*)
  ebdad26... arm: Update fp16-aapcs-[24].c after insn_propagation patch (*)
  2ee70c9... c++: xobj fn call without obj [PR115783] (*)
  9116490... AVR: Support new built-in function __builtin_avr_mask1. (*)
  8d6994f... libgomp: Remove bogus warnings from privatized-ref-2.f90. (*)
  c93be16... Fortran: character array constructor with >= 4 constant ele (*)
  b2f47a5... rs6000: Catch unsupported ABI errors when using -mrop-prote (*)
  58a9f3d... c++: add fixed testcase [PR109464] (*)
  8fbc386... bpf: create modifier for mem operand for xchg and cmpxchg (*)
  cea6473... c++: Add [dcl.init.aggr] examples to testsuite (*)
  a589d3b... Close GCC 11 branch (*)
  0f8261e... c++: Hash placeholder constraint in ctp_hasher (*)
  02cc849... Match: Only allow single use of MIN_EXPR for SAT_TRUNC form (*)
  e20ea6b... Daily bump. (*)
  9846b09... libatomic: Handle AVX+CX16 ZHAOXIN like Intel for 16b atomi (*)
  9690fb3... c++: implement DR1363 and DR1496 for __is_trivial [PR85723] (*)
  248e853... libbacktrace: use __has_attribute for fallthrough (*)
  6962835... rs6000: Fix .machine cpu selection w/ altivec [PR97367] (*)
  c192376... rs6000, update effective target for tests builtins-10*.c an (*)
  f7d01e0... libatomic: Improve cpuid usage in __libat_feat1_init (*)
  1e60a6a... eh: ICE with std::initializer_list and ASan [PR115865] (*)
  5080840... Do not use caller-saved registers for COMDAT fun
[gcc r12-10635] [powerpc] [testsuite] reorder dg directives [PR106069]
https://gcc.gnu.org/g:e142b6607267100537fc7abe6f60a52fc0d8535c

commit r12-10635-ge142b6607267100537fc7abe6f60a52fc0d8535c
Author: Alexandre Oliva
Date: Tue Jul 23 02:19:55 2024 -0300

    [powerpc] [testsuite] reorder dg directives [PR106069]

    The dg-do directive appears after dg-require-effective-target in
    g++.target/powerpc/pr106069.C. That doesn't work the way that was
    presumably intended. Both of these directives set dg-do-what, but
    dg-do does so fully and unconditionally, overriding any decisions
    recorded there by earlier directives. Reorder the directives more
    canonically, so that both take effect.

    for gcc/testsuite/ChangeLog

        PR target/106069
        * g++.target/powerpc/pr106069.C: Reorder dg directives.

    (cherry picked from commit ad65caa332bc7600caff6b9b5b29175b40d91e67)

Diff:
---
 gcc/testsuite/g++.target/powerpc/pr106069.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.target/powerpc/pr106069.C b/gcc/testsuite/g++.target/powerpc/pr106069.C
index 537207d2fe83..826379a4479a 100644
--- a/gcc/testsuite/g++.target/powerpc/pr106069.C
+++ b/gcc/testsuite/g++.target/powerpc/pr106069.C
@@ -1,6 +1,6 @@
+/* { dg-do run } */
 /* { dg-options "-O -fno-tree-forwprop -maltivec" } */
 /* { dg-require-effective-target vmx_hw } */
-/* { dg-do run } */

 typedef __attribute__ ((altivec (vector__))) unsigned native_simd_type;
[gcc r13-8934] [powerpc] [testsuite] reorder dg directives [PR106069]
https://gcc.gnu.org/g:e504184f9175204bc66bf5a95a400bc4685f8ffc

commit r13-8934-ge504184f9175204bc66bf5a95a400bc4685f8ffc
Author: Alexandre Oliva
Date: Tue Jul 23 01:28:00 2024 -0300

    [powerpc] [testsuite] reorder dg directives [PR106069]

    The dg-do directive appears after dg-require-effective-target in
    g++.target/powerpc/pr106069.C. That doesn't work the way that was
    presumably intended. Both of these directives set dg-do-what, but
    dg-do does so fully and unconditionally, overriding any decisions
    recorded there by earlier directives. Reorder the directives more
    canonically, so that both take effect.

    for gcc/testsuite/ChangeLog

        PR target/106069
        * g++.target/powerpc/pr106069.C: Reorder dg directives.

    (cherry picked from commit ad65caa332bc7600caff6b9b5b29175b40d91e67)

Diff:
---
 gcc/testsuite/g++.target/powerpc/pr106069.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.target/powerpc/pr106069.C b/gcc/testsuite/g++.target/powerpc/pr106069.C
index 537207d2fe83..826379a4479a 100644
--- a/gcc/testsuite/g++.target/powerpc/pr106069.C
+++ b/gcc/testsuite/g++.target/powerpc/pr106069.C
@@ -1,6 +1,6 @@
+/* { dg-do run } */
 /* { dg-options "-O -fno-tree-forwprop -maltivec" } */
 /* { dg-require-effective-target vmx_hw } */
-/* { dg-do run } */

 typedef __attribute__ ((altivec (vector__))) unsigned native_simd_type;
[gcc r14-10499] [powerpc] [testsuite] reorder dg directives [PR106069]
https://gcc.gnu.org/g:109b389a0b1528ef7a7c12f0923fb3f5be238f0c

commit r14-10499-g109b389a0b1528ef7a7c12f0923fb3f5be238f0c
Author: Alexandre Oliva
Date: Tue Jul 23 00:44:05 2024 -0300

    [powerpc] [testsuite] reorder dg directives [PR106069]

    The dg-do directive appears after dg-require-effective-target in
    g++.target/powerpc/pr106069.C. That doesn't work the way that was
    presumably intended. Both of these directives set dg-do-what, but
    dg-do does so fully and unconditionally, overriding any decisions
    recorded there by earlier directives. Reorder the directives more
    canonically, so that both take effect.

    for gcc/testsuite/ChangeLog

        PR target/106069
        * g++.target/powerpc/pr106069.C: Reorder dg directives.

    (cherry picked from commit ad65caa332bc7600caff6b9b5b29175b40d91e67)

Diff:
---
 gcc/testsuite/g++.target/powerpc/pr106069.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.target/powerpc/pr106069.C b/gcc/testsuite/g++.target/powerpc/pr106069.C
index 537207d2fe83..826379a4479a 100644
--- a/gcc/testsuite/g++.target/powerpc/pr106069.C
+++ b/gcc/testsuite/g++.target/powerpc/pr106069.C
@@ -1,6 +1,6 @@
+/* { dg-do run } */
 /* { dg-options "-O -fno-tree-forwprop -maltivec" } */
 /* { dg-require-effective-target vmx_hw } */
-/* { dg-do run } */

 typedef __attribute__ ((altivec (vector__))) unsigned native_simd_type;
[gcc r15-2211] [powerpc] [testsuite] reorder dg directives [PR106069]
https://gcc.gnu.org/g:ad65caa332bc7600caff6b9b5b29175b40d91e67

commit r15-2211-gad65caa332bc7600caff6b9b5b29175b40d91e67
Author: Alexandre Oliva
Date: Mon Jul 22 23:09:24 2024 -0300

    [powerpc] [testsuite] reorder dg directives [PR106069]

    The dg-do directive appears after dg-require-effective-target in
    g++.target/powerpc/pr106069.C. That doesn't work the way that was
    presumably intended. Both of these directives set dg-do-what, but
    dg-do does so fully and unconditionally, overriding any decisions
    recorded there by earlier directives. Reorder the directives more
    canonically, so that both take effect.

    for gcc/testsuite/ChangeLog

        PR target/106069
        * g++.target/powerpc/pr106069.C: Reorder dg directives.

Diff:
---
 gcc/testsuite/g++.target/powerpc/pr106069.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.target/powerpc/pr106069.C b/gcc/testsuite/g++.target/powerpc/pr106069.C
index 537207d2fe83..826379a4479a 100644
--- a/gcc/testsuite/g++.target/powerpc/pr106069.C
+++ b/gcc/testsuite/g++.target/powerpc/pr106069.C
@@ -1,6 +1,6 @@
+/* { dg-do run } */
 /* { dg-options "-O -fno-tree-forwprop -maltivec" } */
 /* { dg-require-effective-target vmx_hw } */
-/* { dg-do run } */

 typedef __attribute__ ((altivec (vector__))) unsigned native_simd_type;
[gcc r14-10433] [alpha] adjust MEM alignment for block move [PR115459]
https://gcc.gnu.org/g:c8fdef7fc25dafc8c7a12727c1046b3c7f2b89bb

commit r14-10433-gc8fdef7fc25dafc8c7a12727c1046b3c7f2b89bb
Author: Alexandre Oliva
Date: Tue Jul 16 08:54:20 2024 -0300

    [alpha] adjust MEM alignment for block move [PR115459]

    Before issuing loads or stores for a block move, adjust the MEM
    alignments if analysis of the addresses enabled the inference of
    stricter alignment. This ensures that the MEMs are sufficiently
    aligned for the corresponding insns, which avoids trouble in case of
    e.g. substitutions into SUBREGs.

    for gcc/ChangeLog

        PR target/115459
        * config/alpha/alpha.cc (alpha_expand_block_move): Adjust
        MEMs to match inferred alignment.

    (cherry picked from commit ccfe7151803956d178947d0afda0bd66ce097275)

Diff:
---
 gcc/config/alpha/alpha.cc | 12
 1 file changed, 12 insertions(+)

diff --git a/gcc/config/alpha/alpha.cc b/gcc/config/alpha/alpha.cc
index 1126cea1f7ba..e090e74b9d07 100644
--- a/gcc/config/alpha/alpha.cc
+++ b/gcc/config/alpha/alpha.cc
@@ -3820,6 +3820,12 @@ alpha_expand_block_move (rtx operands[])
           else if (a >= 16 && c % 2 == 0)
             src_align = 16;
         }
+
+      if (MEM_P (orig_src) && MEM_ALIGN (orig_src) < src_align)
+        {
+          orig_src = shallow_copy_rtx (orig_src);
+          set_mem_align (orig_src, src_align);
+        }
     }

   tmp = XEXP (orig_dst, 0);
@@ -3841,6 +3847,12 @@ alpha_expand_block_move (rtx operands[])
           else if (a >= 16 && c % 2 == 0)
             dst_align = 16;
         }
+
+      if (MEM_P (orig_dst) && MEM_ALIGN (orig_dst) < dst_align)
+        {
+          orig_dst = shallow_copy_rtx (orig_dst);
+          set_mem_align (orig_dst, dst_align);
+        }
     }

   ofs = 0;
[gcc(refs/users/aoliva/heads/testme)] [strub] adjust all at-calls type variants at once
https://gcc.gnu.org/g:110c93a4411dbdaf3581364996a7d9760d1247bd

commit 110c93a4411dbdaf3581364996a7d9760d1247bd
Author: Alexandre Oliva
Date: Tue Jul 16 05:33:07 2024 -0300

    [strub] adjust all at-calls type variants at once

    TYPE_ARG_TYPES of type variants must compare equal, according to
    verify_type, but adjust_at_calls_type didn't preserve this invariant.

    Adjust the main type variant and propagate TYPE_ARG_TYPES to all
    variants. While at that, also adjust the canonical type and its
    variants, and then verify_type.

    for gcc/ChangeLog

        PR c/115848
        * ipa-strub.cc (pass_ipa_strub::adjust_at_calls_type_main):
        Rename from...
        (pass_ipa_strub::adjust_at_calls_type): ... this. Preserve
        TYPE_ARG_TYPES across all variants. Adjust TYPE_CANONICAL and
        verify_type.

    for gcc/testsuite/ChangeLog

        PR c/115848
        * c-c++-common/strub-pr115848.c: New.
        * c-c++-common/strub-pr115848-b.c: New.

Diff:
---
 gcc/ipa-strub.cc | 41 +--
 gcc/testsuite/c-c++-common/strub-pr115848-b.c | 6
 gcc/testsuite/c-c++-common/strub-pr115848.c | 8 ++
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
index 8fa7bdf53002..15d91c994bf8 100644
--- a/gcc/ipa-strub.cc
+++ b/gcc/ipa-strub.cc
@@ -1891,6 +1891,7 @@ public:

 #undef DEF_IDENT

+  static inline int adjust_at_calls_type_main (tree);
   static inline int adjust_at_calls_type (tree);
   static inline void adjust_at_calls_call (cgraph_edge *, int, tree);
   static inline void adjust_at_calls_calls (cgraph_node *);
@@ -2348,15 +2349,51 @@ strub_watermark_parm (tree fndecl)
   gcc_unreachable ();
 }

+/* Adjust a STRUB_AT_CALLS function TYPE and all its variants,
+   preserving TYPE_ARG_TYPES identity, adding a watermark pointer if
+   it hasn't been added yet. Return the named argument count. */
+
+int
+pass_ipa_strub::adjust_at_calls_type (tree type)
+{
+  gcc_checking_assert (same_strub_mode_in_variants_p (type));
+
+  tree tmain = TYPE_MAIN_VARIANT (type);
+  tree orig_types = TYPE_ARG_TYPES (tmain);
+  gcc_checking_assert (TYPE_ARG_TYPES (type) == orig_types);
+  int named_args = adjust_at_calls_type_main (tmain);
+  tree mod_types = TYPE_ARG_TYPES (tmain);
+
+  if (mod_types != orig_types)
+    for (tree other = TYPE_NEXT_VARIANT (tmain);
+         other != NULL_TREE; other = TYPE_NEXT_VARIANT (other))
+      {
+        gcc_checking_assert (TYPE_ARG_TYPES (other) == orig_types);
+        TYPE_ARG_TYPES (other) = mod_types;
+      }
+
+  if (TYPE_CANONICAL (type)
+      && TYPE_MAIN_VARIANT (TYPE_CANONICAL (type)) != tmain)
+    {
+      int ret = adjust_at_calls_type (TYPE_CANONICAL (type));
+      gcc_checking_assert (named_args == ret);
+    }
+
+  if (flag_checking)
+    verify_type (type);
+
+  return named_args;
+}
+
 /* Adjust a STRUB_AT_CALLS function TYPE, adding a watermark pointer if it
    hasn't been added yet. Return the named argument count. */

 int
-pass_ipa_strub::adjust_at_calls_type (tree type)
+pass_ipa_strub::adjust_at_calls_type_main (tree type)
 {
   int named_args = 0;

-  gcc_checking_assert (same_strub_mode_in_variants_p (type));
+  gcc_checking_assert (TYPE_MAIN_VARIANT (type) == type);

   if (!TYPE_ARG_TYPES (type))
     return named_args;
diff --git a/gcc/testsuite/c-c++-common/strub-pr115848-b.c b/gcc/testsuite/c-c++-common/strub-pr115848-b.c
new file mode 100644
index ..9b9e134b3f41
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-pr115848-b.c
@@ -0,0 +1,6 @@
+/* { dg-skip-if part { *-*-* } } */
+void __attribute__((__strub__)) b(int, int) {}
+void c(void);
+int main() {
+  c();
+}
diff --git a/gcc/testsuite/c-c++-common/strub-pr115848.c b/gcc/testsuite/c-c++-common/strub-pr115848.c
new file mode 100644
index ..158654090721
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-pr115848.c
@@ -0,0 +1,8 @@
+/* { dg-do link } */
+/* { dg-require-effective-target lto } */
+/* { dg-options "-flto" } */
+/* { dg-additional-sources "strub-pr115848-b.c" } */
+
+typedef void __attribute__((__strub__)) a(int, int);
+a(b);
+void c() { b(0, 0); }