Go patch committed: Handle f().x if f returns a zero-sized type
This patch to the Go frontend handles the case of f().x when the
function f returns a zero-sized type.  In that case the GCC interface
will have changed f to return void, as the GCC middle-end does not have
complete support for zero-sized types.  This patch handles the case of
void when in a struct field expression.  The test case for this is
https://go.dev/cl/417874.  This fixes https://go.dev/issue/23870.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

        * go-gcc.cc (Gcc_backend::struct_field_expression): Handle a void
        expression, as for f().x where f returns a zero-sized type.

2b7b330427a60c8a5ef00d940adde0160ce04f27
diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index 7b4b2adb058..1ba7206caeb 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -1707,6 +1707,13 @@ Gcc_backend::struct_field_expression(Bexpression* bstruct, size_t index,
   if (struct_tree == error_mark_node
       || TREE_TYPE(struct_tree) == error_mark_node)
     return this->error_expression();
+
+  // A function call that returns a zero-sized object will have been
+  // changed to return void.  A zero-sized object can have a
+  // (zero-sized) field, so support that case.
+  if (TREE_TYPE(struct_tree) == void_type_node)
+    return bstruct;
+
   gcc_assert(TREE_CODE(TREE_TYPE(struct_tree)) == RECORD_TYPE);
   tree field = TYPE_FIELDS(TREE_TYPE(struct_tree));
   if (field == NULL_TREE)
[x86_64 PATCH] PR target/106231: Optimize (any_extend:DI (ctz:SI ...)).
This patch resolves PR target/106231 by providing insns that recognize
(zero_extend:DI (ctz:SI ...)) and (sign_extend:DI (ctz:SI ...)).  The
result of ctz:SI is always between 0 and 32 (or undefined), so
sign_extension is the same as zero_extension, and the result is already
extended in the destination register.

Things are a little complicated, because the existing implementation
of *ctzsi2 handles multiple cases, including false dependencies, which
we continue to support in this patch.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?

2022-07-16  Roger Sayle

gcc/ChangeLog
        PR target/106231
        * config/i386/i386.md (*ctzsidi2_ext): New insn_and_split
        to recognize any_extend:DI of ctz:SI which is implicitly extended.
        (*ctzsidi2_ext_falsedep): New define_insn to model a DImode
        extended ctz:SI that has preceding xor to break false dependency.

gcc/testsuite/ChangeLog
        PR target/106231
        * gcc.target/i386/pr106231-1.c: New test case.
        * gcc.target/i386/pr106231-2.c: New test case.

Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3b02d0c..164b0c2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -16431,6 +16431,66 @@
    (set_attr "prefix_rep" "1")
    (set_attr "mode" "SI")])
 
+(define_insn_and_split "*ctzsidi2_ext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (any_extend:DI
+          (ctz:SI
+            (match_operand:SI 1 "nonimmediate_operand" "rm"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_64BIT"
+{
+  if (TARGET_BMI)
+    return "tzcnt{l}\t{%1, %k0|%k0, %1}";
+  else if (TARGET_CPU_P (GENERIC)
+           && !optimize_function_for_size_p (cfun))
+    /* tzcnt expands to 'rep bsf' and we can use it even if !TARGET_BMI.  */
+    return "rep%; bsf{l}\t{%1, %k0|%k0, %1}";
+  return "bsf{l}\t{%1, %k0|%k0, %1}";
+}
+  "(TARGET_BMI || TARGET_CPU_P (GENERIC))
+   && TARGET_AVOID_FALSE_DEP_FOR_BMI && epilogue_completed
+   && optimize_function_for_speed_p (cfun)
+   && !reg_mentioned_p (operands[0], operands[1])"
+  [(parallel
+    [(set (match_dup 0)
+          (any_extend:DI (ctz:SI (match_dup 1))))
+     (unspec [(match_dup 0)] UNSPEC_INSN_FALSE_DEP)
+     (clobber (reg:CC FLAGS_REG))])]
+  "ix86_expand_clear (operands[0]);"
+  [(set_attr "type" "alu1")
+   (set_attr "prefix_0f" "1")
+   (set (attr "prefix_rep")
+     (if_then_else
+       (ior (match_test "TARGET_BMI")
+            (and (not (match_test "optimize_function_for_size_p (cfun)"))
+                 (match_test "TARGET_CPU_P (GENERIC)")))
+       (const_string "1")
+       (const_string "0")))
+   (set_attr "mode" "SI")])
+
+(define_insn "*ctzsidi2_ext_falsedep"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (any_extend:DI
+          (ctz:SI
+            (match_operand:SI 1 "nonimmediate_operand" "rm"))))
+   (unspec [(match_operand:DI 2 "register_operand" "0")]
+           UNSPEC_INSN_FALSE_DEP)
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_64BIT"
+{
+  if (TARGET_BMI)
+    return "tzcnt{l}\t{%1, %k0|%k0, %1}";
+  else if (TARGET_CPU_P (GENERIC))
+    /* tzcnt expands to 'rep bsf' and we can use it even if !TARGET_BMI.  */
+    return "rep%; bsf{l}\t{%1, %k0|%k0, %1}";
+  else
+    gcc_unreachable ();
+}
+  [(set_attr "type" "alu1")
+   (set_attr "prefix_0f" "1")
+   (set_attr "prefix_rep" "1")
+   (set_attr "mode" "SI")])
+
 (define_insn "bsr_rex64"
   [(set (reg:CCZ FLAGS_REG)
         (compare:CCZ (match_operand:DI 1 "nonimmediate_operand" "rm")
diff --git a/gcc/testsuite/gcc.target/i386/pr106231-1.c b/gcc/testsuite/gcc.target/i386/pr106231-1.c
new file mode 100644
index 000..d17297f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106231-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=generic" } */
+long long
+foo(long long x, unsigned bits)
+{
+  return x + (unsigned) __builtin_ctz(bits);
+}
+/* { dg-final { scan-assembler-not "cltq" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr106231-2.c b/gcc/testsuite/gcc.target/i386/pr106231-2.c
new file mode 100644
index 000..fd3a8e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106231-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=ivybridge" } */
+long long
+foo(long long x, unsigned bits)
+{
+  return x + (unsigned) __builtin_ctz(bits);
+}
+/* { dg-final { scan-assembler-not "cltq" } } */
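
As a rough illustration of why the two extensions coincide, here is a
small C sketch (not part of the patch).  The helper names widen_zero,
widen_sign and check_extensions_agree are made up for this note; only
add_ctz mirrors the shape of the pr106231 test cases.  It assumes the
usual GCC behaviour for converting an out-of-range unsigned value to
int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a 32-bit count by zero extension.  */
uint64_t widen_zero(uint32_t count)
{
  return (uint64_t) count;
}

/* Hypothetical helper: widen the same count by sign extension.  */
uint64_t widen_sign(uint32_t count)
{
  return (uint64_t) (int64_t) (int32_t) count;
}

/* Every defined result of a 32-bit ctz lies in 0..32, so the sign bit
   is never set and both widenings give the same 64-bit value.  */
void check_extensions_agree(void)
{
  for (uint32_t count = 0; count <= 32; count++)
    assert(widen_zero(count) == widen_sign(count));
}

/* Same shape as the new test cases.  BITS must be nonzero, since
   __builtin_ctz (0) is undefined.  */
long long add_ctz(long long x, unsigned bits)
{
  return x + (unsigned) __builtin_ctz(bits);
}
```

Calling check_extensions_agree() exercises the whole 0..32 range; a
value with bit 31 set, such as 0x80000000, is where the two widenings
would differ, and ctz:SI never produces one.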
[AVX512 PATCH] Add UNSPEC_MASKOP to kunpck instructions in sse.md.
This AVX512 specific patch to sse.md is split out from an earlier patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596199.html

The new splitters proposed in that patch interfere with AVX512's
kunpckdq instruction which is defined as identical RTL,
DW:DI = (HI:SI<<32)|zero_extend(LO:SI).  To distinguish these, and
avoid AVX512 mask registers accidentally being (ab)used by reload to
perform SImode scalar shifts, this patch adds the explicit
(unspec UNSPEC_MASKOP) to the unpack mask operations, which matches
what sse.md does for the other mask specific (logic) operations.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2022-07-16  Roger Sayle

gcc/ChangeLog
        * config/i386/sse.md (kunpckhi): Add UNSPEC_MASKOP unspec.
        (kunpcksi): Likewise, add UNSPEC_MASKOP unspec.
        (kunpckdi): Likewise, add UNSPEC_MASKOP unspec.
        (vec_pack_trunc_qi): Update to specify required UNSPEC_MASKOP
        unspec.
        (vec_pack_trunc_): Likewise.

Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 62688f8..da50ffa 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2072,7 +2072,8 @@
           (ashift:HI
             (zero_extend:HI (match_operand:QI 1 "register_operand" "k"))
             (const_int 8))
-          (zero_extend:HI (match_operand:QI 2 "register_operand" "k"))))]
+          (zero_extend:HI (match_operand:QI 2 "register_operand" "k"))))
+   (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512F"
   "kunpckbw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "HI")
@@ -2085,7 +2086,8 @@
           (ashift:SI
             (zero_extend:SI (match_operand:HI 1 "register_operand" "k"))
             (const_int 16))
-          (zero_extend:SI (match_operand:HI 2 "register_operand" "k"))))]
+          (zero_extend:SI (match_operand:HI 2 "register_operand" "k"))))
+   (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512BW"
   "kunpckwd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "SI")])
@@ -2096,7 +2098,8 @@
           (ashift:DI
             (zero_extend:DI (match_operand:SI 1 "register_operand" "k"))
             (const_int 32))
-          (zero_extend:DI (match_operand:SI 2 "register_operand" "k"))))]
+          (zero_extend:DI (match_operand:SI 2 "register_operand" "k"))))
+   (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512BW"
   "kunpckdq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "DI")])
@@ -17400,21 +17403,26 @@
 })
 
 (define_expand "vec_pack_trunc_qi"
-  [(set (match_operand:HI 0 "register_operand")
-        (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 2 "register_operand"))
-                           (const_int 8))
-                (zero_extend:HI (match_operand:QI 1 "register_operand"))))]
+  [(parallel
+    [(set (match_operand:HI 0 "register_operand")
+          (ior:HI
+            (ashift:HI (zero_extend:HI (match_operand:QI 2 "register_operand"))
+                       (const_int 8))
+            (zero_extend:HI (match_operand:QI 1 "register_operand"))))
+     (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512F")
 
 (define_expand "vec_pack_trunc_"
-  [(set (match_operand: 0 "register_operand")
-        (ior:
-          (ashift:
+  [(parallel
+    [(set (match_operand: 0 "register_operand")
+          (ior:
+            (ashift:
+              (zero_extend:
+                (match_operand:SWI24 2 "register_operand"))
+              (match_dup 3))
             (zero_extend:
-              (match_operand:SWI24 2 "register_operand"))
-            (match_dup 3))
-          (zero_extend:
-            (match_operand:SWI24 1 "register_operand"))))]
+              (match_operand:SWI24 1 "register_operand"))))
+     (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512BW"
 {
   operands[3] = GEN_INT (GET_MODE_BITSIZE (mode));
[middle-end PATCH] PR c/106264: Silence warnings from __builtin_modf et al.
This middle-end patch resolves PR c/106264 which is a spurious warning
regression caused by the tree-level expansion of modf, frexp and remquo
producing "expression has no-effect" when the built-in function's result
is ignored.  When these built-ins were first expanded at tree-level,
fold_builtin_n would blindly set TREE_NO_WARNING for all built-ins.
Now that we're more discerning, we should set TREE_NO_WARNING only on
those COMPOUND_EXPRs that need it.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?

2022-07-16  Roger Sayle

gcc/ChangeLog
        PR c/106264
        * builtins.cc (fold_builtin_frexp): Set TREE_NO_WARNING on
        COMPOUND_EXPR to silence spurious warning if result isn't used.
        (fold_builtin_modf): Likewise.
        (do_mpfr_remquo): Likewise.

gcc/testsuite/ChangeLog
        PR c/106264
        * gcc.dg/pr106264.c: New test case.

Thanks in advance,
Roger
--

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 35b9197..c745777 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -8625,7 +8625,7 @@ fold_builtin_frexp (location_t loc, tree arg0, tree arg1, tree rettype)
   if (TYPE_MAIN_VARIANT (TREE_TYPE (arg1)) == integer_type_node)
     {
       const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (arg0);
-      tree frac, exp;
+      tree frac, exp, res;
 
       switch (value->cl)
       {
@@ -8656,7 +8656,9 @@ fold_builtin_frexp (location_t loc, tree arg0, tree arg1, tree rettype)
       /* Create the COMPOUND_EXPR (*arg1 = trunc, frac). */
       arg1 = fold_build2_loc (loc, MODIFY_EXPR, rettype, arg1, exp);
       TREE_SIDE_EFFECTS (arg1) = 1;
-      return fold_build2_loc (loc, COMPOUND_EXPR, rettype, arg1, frac);
+      res = fold_build2_loc (loc, COMPOUND_EXPR, rettype, arg1, frac);
+      TREE_NO_WARNING (res) = 1;
+      return res;
     }
 
   return NULL_TREE;
@@ -8682,6 +8684,7 @@ fold_builtin_modf (location_t loc, tree arg0, tree arg1, tree rettype)
     {
       const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (arg0);
       REAL_VALUE_TYPE trunc, frac;
+      tree res;
 
       switch (value->cl)
       {
@@ -8711,8 +8714,10 @@ fold_builtin_modf (location_t loc, tree arg0, tree arg1, tree rettype)
       arg1 = fold_build2_loc (loc, MODIFY_EXPR, rettype, arg1,
                               build_real (rettype, trunc));
       TREE_SIDE_EFFECTS (arg1) = 1;
-      return fold_build2_loc (loc, COMPOUND_EXPR, rettype, arg1,
-                              build_real (rettype, frac));
+      res = fold_build2_loc (loc, COMPOUND_EXPR, rettype, arg1,
+                             build_real (rettype, frac));
+      TREE_NO_WARNING (res) = 1;
+      return res;
     }
 
   return NULL_TREE;
@@ -10673,8 +10678,10 @@ do_mpfr_remquo (tree arg0, tree arg1, tree arg_quo)
                                             integer_quo));
               TREE_SIDE_EFFECTS (result_quo) = 1;
               /* Combine the quo assignment with the rem. */
-              result = non_lvalue (fold_build2 (COMPOUND_EXPR, type,
-                                                result_quo, result_rem));
+              result = fold_build2 (COMPOUND_EXPR, type,
+                                    result_quo, result_rem);
+              TREE_NO_WARNING (result) = 1;
+              result = non_lvalue (result);
             }
         }
     }
diff --git a/gcc/testsuite/gcc.dg/pr106264.c b/gcc/testsuite/gcc.dg/pr106264.c
new file mode 100644
index 000..6b4af49
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr106264.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall" } */
+double frexp (double, int*);
+double modf (double, double*);
+double remquo (double, double, int*);
+
+int f (void)
+{
+  int y;
+  frexp (1.0, &y);
+  return y;
+}
+
+double g (void)
+{
+  double y;
+  modf (1.0, &y);
+  return y;
+}
+
+int h (void)
+{
+  int y;
+  remquo (1.0, 1.0, &y);
+  return y;
+}
+
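
For readers less familiar with the folding, the COMPOUND_EXPR
(*arg1 = trunc, frac) corresponds to a C comma expression.  The sketch
below is only a hand-written analogy for modf (1.5, iptr), not what
builtins.cc emits; the names folded_modf_1_5 and integral_part_only are
invented for this note.

```c
#include <assert.h>

/* Hypothetical hand-written analogue of folding modf (1.5, iptr):
   store the integral part through the pointer, then yield the
   fractional part as the value of the comma expression.  */
double folded_modf_1_5(double *iptr)
{
  return (*iptr = 1.0, 0.5);
}

/* A caller that only wants the side effect.  The fractional value is
   deliberately dropped, which is the situation where the spurious
   warning used to appear on the folded expression.  */
double integral_part_only(void)
{
  double ip;
  (void) folded_modf_1_5(&ip);
  return ip;
}

/* Both halves of the comma expression are observable.  */
void check_folded_modf(void)
{
  double ip = 0.0;
  assert(folded_modf_1_5(&ip) == 0.5);
  assert(ip == 1.0);
}
```

The point of the analogy is that the right-hand operand has no side
effect of its own, so a compiler that sees the result ignored has
grounds to warn unless the expression is marked as exempt.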
[x86 PATCH] Fix issue with x86_64_const_vector_operand predicate.
This patch fixes (what I believe is) a latent bug in i386.md's
x86_64_const_vector_operand define_predicate.  According to the
documentation, when a predicate is called with rtx operand OP and
machine_mode operand MODE, we shouldn't assume that the MODE is (or
has been checked to be) GET_MODE (OP).

The failure mode is that recog can call x86_64_const_vector_operand
on an arbitrary CONST_VECTOR passing a MODE of V2QImode, but when
the CONST_VECTOR is in fact V1TImode, it's unsafe to directly call
ix86_convert_const_vector_to_integer, which assumes that the
CONST_VECTOR contains CONST_INTs when it actually contains
CONST_WIDE_INTs.  The checks in this define_predicate need to be
testing OP's mode, and ideally confirming that this matches the
passed in/specified MODE.

This bug is currently latent, but adding an innocent/unrelated
define_insn, such as "(set (reg:CCC FLAGS_REG) (const_int 0))" to
i386.md can occasionally change the order in which genrecog generates
its tests, then ICEing during bootstrap due to V1TI CONST_VECTORs.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?

2022-07-16  Roger Sayle

gcc/ChangeLog
        * config/i386/predicates.md (x86_64_const_vector_operand):
        Check the operand's mode matches the specified mode argument.

Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index c71c453..42053ea 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1199,6 +1199,10 @@
 (define_predicate "x86_64_const_vector_operand"
   (match_code "const_vector")
 {
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+  else if (GET_MODE (op) != mode)
+    return false;
   if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
     return false;
   HOST_WIDE_INT val = ix86_convert_const_vector_to_integer (op, mode);
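
The guard added above can be modelled outside GCC.  The following is a
toy sketch only: enum toy_mode, toy_mode_size and TOY_WORD_BYTES are
stand-ins invented for this note and are not GCC's machine_mode,
GET_MODE_SIZE or UNITS_PER_WORD.  It shows the order of the checks:
resolve the mode from the operand, reject a mismatch, and only then
apply the size test.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for a few machine modes (hypothetical, not GCC's).  */
enum toy_mode { TOY_VOID, TOY_V2QI, TOY_V8QI, TOY_V1TI };

/* Size in bytes of each toy mode, indexed by enum toy_mode.  */
static const unsigned toy_mode_size[] = { 0, 2, 8, 16 };

/* Stand-in for the word size on a 64-bit target.  */
#define TOY_WORD_BYTES 8

/* Model of the predicate's early checks.  OP_MODE plays the role of
   GET_MODE (op); MODE is the mode the caller asked for.  */
bool toy_const_vector_ok(enum toy_mode op_mode, enum toy_mode mode)
{
  if (mode == TOY_VOID)
    mode = op_mode;             /* No mode requested: use the operand's.  */
  else if (op_mode != mode)
    return false;               /* Requested mode does not match operand.  */
  if (toy_mode_size[mode] > TOY_WORD_BYTES)
    return false;               /* Too wide to convert to one integer.  */
  return true;
}

/* The failure case from the description: a V1TI operand queried with
   a V2QI mode must be rejected before any conversion is attempted.  */
void check_toy_predicate(void)
{
  assert(!toy_const_vector_ok(TOY_V1TI, TOY_V2QI));
  assert(toy_const_vector_ok(TOY_V2QI, TOY_V2QI));
}
```

Without the mismatch check, the size test would look at the 2-byte
requested mode and let the 16-byte operand through, which is the latent
problem the patch describes.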
Re: [PATCH 2/2] xtensa: Optimize "bitwise AND with imm1" followed by "branch if (not) equal to imm2"
On Fri, Jul 15, 2022 at 4:17 PM Takayuki 'January June' Suwa wrote:
>
> This patch enhances the effectiveness of the previously posted one:
> "xtensa: Optimize bitwise AND operation with some specific forms of
> constants".
>
> /* example */
> extern void foo(int);
> void test(int a) {
>   if ((a & (-1U << 8)) == (128 << 8)) /* 0 or one of "b4const" */
>     foo(a);
> }
>
> ;; before
>         .global test
> test:
>         movi    a3, -0x100
>         movi.n  a4, 1
>         and     a3, a2, a3
>         slli    a4, a4, 15
>         bne     a3, a4, .L3
>         j.l     foo, a9
> .L1:
>         ret.n
>
> ;; after
>         .global test
> test:
>         srli    a3, a2, 8
>         bnei    a3, 128, .L1
>         j.l     foo, a9
> .L1:
>         ret.n
>
> gcc/ChangeLog:
>
>         * config/xtensa/xtensa.md
>         (*masktrue_const_pow2_minus_one, *masktrue_const_negative_pow2,
>         *masktrue_const_shifted_mask): If the immediate for bitwise AND is
>         represented as '-(1 << N)', decrease the lower bound of N from 12
>         to 1.  And the other immediate for conditional branch is now no
>         longer limited to zero, but also one of some positive integers.
>         Finally, remove the checks of some conditions, because the comparison
>         expressions that don't satisfy such checks are determined as
>         compile-time constants and thus will be optimized away before
>         RTL expansion.
> ---
>  gcc/config/xtensa/xtensa.md | 73 ++---
>  1 file changed, 44 insertions(+), 29 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

--
Thanks.
-- Max
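
A quick way to convince oneself that the "before" and "after" sequences
in the quoted example test the same condition is to compare the two
forms in plain C.  This is a sketch written for this note, not code
from the patch; masked_compare, shifted_compare and check_forms_agree
are hypothetical names, and the operand is taken as unsigned so that
the right shift is logical, like srli.

```c
#include <assert.h>

/* The source-level condition: mask off the low 8 bits, then compare
   against 128 << 8.  */
int masked_compare(unsigned a)
{
  return (a & (-1U << 8)) == (128u << 8);
}

/* The form the "after" code computes: shift right by 8, then compare
   against the small immediate 128.  */
int shifted_compare(unsigned a)
{
  return (a >> 8) == 128u;
}

/* Sweep across the boundary around 128 << 8 and a few other values
   with high bits set.  */
void check_forms_agree(void)
{
  static const unsigned extra[] = { 0u, 0xffu, 0x80008000u, 0xffffffffu };

  for (unsigned a = 0x7e00u; a <= 0x8200u; a++)
    assert(masked_compare(a) == shifted_compare(a));
  for (unsigned i = 0; i < sizeof extra / sizeof extra[0]; i++)
    assert(masked_compare(extra[i]) == shifted_compare(extra[i]));
}
```

Both forms discard the low 8 bits and compare the remaining 24, which
is why the shifted version can use a branch-immediate instead of
materialising two wide constants.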
Re: [PATCH 1/2] xtensa: constantsynth: Make try to find shorter instruction
On Fri, Jul 15, 2022 at 4:17 PM Takayuki 'January June' Suwa wrote:
>
> This patch allows the constant synthesis to choose shorter instruction
> if possible.
>
> /* example */
> int test(void) {
>   return 128 << 8;
> }
>
> ;; before
> test:
>         movi    a2, 0x100
>         addmi   a2, a2, 0x7f00
>         ret.n
>
> ;; after
> test:
>         movi.n  a2, 1
>         slli    a2, a2, 15
>         ret.n
>
> When the Code Density Option is configured, the latter is one byte smaller
> than the former.
>
> gcc/ChangeLog:
>
>         * config/xtensa/xtensa.cc (xtensa_emit_constantsynth): Remove.
>         (xtensa_constantsynth_2insn): Change to try all three synthetic
>         methods and to use the one that fits the immediate value of
>         the seed into a Narrow Move Immediate instruction "MOVI.N"
>         when the Code Density Option is configured.
> ---
>  gcc/config/xtensa/xtensa.cc | 58 ++---
>  1 file changed, 29 insertions(+), 29 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

--
Thanks.
-- Max
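
The arithmetic behind the two sequences in the quoted example can be
checked in a few lines of C.  This is only a sketch of the value
computation, with made-up function names; it says nothing about
instruction encodings or sizes.

```c
#include <assert.h>

/* "before": load 0x100, then add 0x7f00 (the movi + addmi pair).  */
int synth_movi_addmi(void)
{
  int a2 = 0x100;
  a2 += 0x7f00;
  return a2;
}

/* "after": load 1, then shift left by 15 (the movi.n + slli pair).  */
int synth_movin_slli(void)
{
  int a2 = 1;
  a2 <<= 15;
  return a2;
}

/* Both routes must produce the constant from the example.  */
void check_synth_equal(void)
{
  assert(synth_movi_addmi() == (128 << 8));
  assert(synth_movin_slli() == (128 << 8));
}
```

Both sequences yield 0x8000; the second starts from a seed of 1, which
is small enough for the narrow move the ChangeLog entry mentions.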