[Bug testsuite/109549] [14 Regression] Conditional move regressions after r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549 --- Comment #13 from Stefan Schulze Frielinghaus --- I will take it and I've already prepared a patch. Currently, I'm still testing the patch. I hope I get enough compute resources in order to make it into GCC 14. Anyhow, you can assign the issue to me (I think I don't have permissions to do it myself).
[Bug testsuite/109549] [14 Regression] Conditional move regressions after r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549 --- Comment #11 from Stefan Schulze Frielinghaus --- I will have a look at those s390x failures and come up with a TARGET_NOCE_CONVERSION_PROFITABLE_P implementation.
[Bug target/113994] [13/14 Regression] Probable C++ code generation bug with -O2 on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113994 --- Comment #6 from Stefan Schulze Frielinghaus --- Looks like wrong liveness information. The problem is that df_analyze_loop only considers basic blocks which strictly belong to a loop, which is not enough. During loop2_doloop basic block 9 (previously 8) embodies the CC consumer jump_insn 42 and is not part of the loop and therefore does not contribute to the liveness analysis. A quick and dirty experiment by forcing a merge with BB 9

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index f0eb4c93957..79f37e22ec1 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -957,9 +957,11 @@ df_worklist_propagate_backward (struct dataflow *dataflow,
       if (EDGE_COUNT (bb->succs) > 0)
         FOR_EACH_EDGE (e, ei, bb->succs)
           {
-            if (bbindex_to_postorder[e->dest->index] < last_change_age.length ()
+            if ((bbindex_to_postorder[e->dest->index] < last_change_age.length ()
                 && age <= last_change_age[bbindex_to_postorder[e->dest->index]]
                 && bitmap_bit_p (considered, e->dest->index))
+                || (strcmp ("loop2_doloop", current_pass->name) == 0
+                    && e->src->index == 6 && e->dest->index == 9))
               changed |= dataflow->problem->con_fun_n (e);
           }
       else if (dataflow->problem->con_fun_0)

shows that, now, CC is live at BB 6 and therefore doloop performs no transformation due to

  bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified);
  BITMAP_FREE (modified);

  if (fail)
    {
      if (dump_file)
        fprintf (dump_file, "Doloop: doloop pattern clobbers live out\n");
      return false;
    }

In a first try I enlarged the set of basic blocks for which df_analyze_loop is run to also include basic blocks which have a direct edge originating from a basic block of a loop. Of course, this solves this problem. However, in general this may not be enough. I'm wondering what the IL allows. Is it possible to have a graph containing not only outgoing edges of a loop but also ingoing? If so I think we would need to compute the set of basic blocks which are reachable from within the loop.
Any thoughts?
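To make the last idea concrete, here is a minimal sketch of computing the set of blocks reachable from within a loop. It is not GCC code: the `Cfg` type and the function name are hypothetical, and the CFG is modelled as a plain successor list rather than GCC's `basic_block`/`edge` structures.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Hypothetical CFG model: succs[b] lists the successor blocks of block b.
using Cfg = std::vector<std::vector<int>>;

// Return every block reachable from any block of the loop, including the
// loop blocks themselves.  This is a plain breadth-first search.
std::set<int> reachable_from_loop(const Cfg &succs, const std::set<int> &loop) {
  std::set<int> seen(loop.begin(), loop.end());
  std::queue<int> work;
  for (int b : loop) {
    assert(b >= 0 && b < static_cast<int>(succs.size()));
    work.push(b);
  }
  while (!work.empty()) {
    int b = work.front();
    work.pop();
    for (int s : succs[b])
      if (seen.insert(s).second)  // first visit of s
        work.push(s);
  }
  return seen;
}
```

With a loop {1, 2} and edges 2 -> 3 -> 4, blocks 3 and 4 would be included even though only 3 has a direct edge from the loop, which is the case the "direct edge" approximation misses.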
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #9 from Stefan Schulze Frielinghaus ---
(In reply to Jonathan Wakely from comment #7)
> We can't use memcmp if the sizes are different. We don't want to use the
> min, we want to guard that code with the sizes being the same, then we can
> just use len*sizeof(*first1) because we know it's the same as
> sizeof(*first2).

Hehe, I was about to add another comment. I just confused myself with taking the minimum, but we rather need to ensure that we are walking over same-sized integers. LGTM
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #6 from Stefan Schulze Frielinghaus --- Guard __is_byte_iter checks for contiguous bytes, which I guess is fine for std::vector, and then checks for __is_memcmp_ordered, which is fine for big-endian targets in conjunction with unsigned integers. From cpp_type_traits.h we have:

  // Whether memcmp can be used to determine ordering for a type
  // e.g. in std::lexicographical_compare or three-way comparisons.
  // True for unsigned integer-like types where comparing each byte in turn
  // as an unsigned char yields the right result. This is true for all
  // unsigned integers on big endian targets, but only unsigned narrow
  // character types (and std::byte) on little endian targets.
  template<typename _Tp, bool _TreatAsBytes =
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        __is_integer<_Tp>::__value
#else
        __is_byte<_Tp>::__value
#endif

Thus using memcmp here is fine. However, I'm still a bit unsure whether we really have to take the minimum of the sizes of *__first1 and *__first2, since I haven't found any size relation between those types.
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960

Stefan Schulze Frielinghaus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jwakely at redhat dot com

--- Comment #4 from Stefan Schulze Frielinghaus --- While giving it a second thought, maybe something like

  const auto __len_bytes = __len * std::min (sizeof (*__first1), sizeof (*__first2));

would be more appropriate since AFAICT the types _InputIter1 and _InputIter2 are not related to each other w.r.t. the size of the type they point to. Maybe Jonathan can shed some light on this?
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #3 from Stefan Schulze Frielinghaus --- This seems to be a bug in the three-way comparison introduced with C++20. The bug happens while deciding whether key v2 already exists in the map or not.

  template<typename _InputIter1, typename _InputIter2, typename _Comp>
    constexpr auto
    lexicographical_compare_three_way(_InputIter1 __first1,
                                      _InputIter1 __last1,
                                      _InputIter2 __first2,
                                      _InputIter2 __last2,
                                      _Comp __comp)
    -> decltype(__comp(*__first1, *__first2))
    {
      // concept requirements
      __glibcxx_function_requires(_InputIteratorConcept<_InputIter1>)
      __glibcxx_function_requires(_InputIteratorConcept<_InputIter2>)
      __glibcxx_requires_valid_range(__first1, __last1);
      __glibcxx_requires_valid_range(__first2, __last2);

      using _Cat = decltype(__comp(*__first1, *__first2));
      static_assert(same_as<common_comparison_category_t<_Cat>, _Cat>);

      if (!std::__is_constant_evaluated())
        if constexpr (same_as<_Comp, __detail::_Synth3way>
                      || same_as<_Comp, compare_three_way>)
          if constexpr (__is_byte_iter<_InputIter1>)
            if constexpr (__is_byte_iter<_InputIter2>)
              {
                const auto [__len, __lencmp] = _GLIBCXX_STD_A::
                  __min_cmp(__last1 - __first1, __last2 - __first2);
                if (__len)
                  {
                    const auto __c
                      = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
                    if (__c != 0)
                      return __c;
                  }
                return __lencmp;
              }

__len equals 1 since both vectors have length 1. However, memcmp should be called with the number of bytes and not the number of elements of the vector. That means memcmp is called with two pointers to MEMs of unsigned shorts 1 and 2 where the high bytes equal 0, and therefore memcmp returns 0 on big-endian targets. Ultimately __lencmp is returned, which itself equals std::strong_ordering::equal, so the keys compare equal and v2 replaces v1.

Fixed by

diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index d534e02871f..6ebece315f7 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -1867,8 +1867,10 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
                  __min_cmp(__last1 - __first1, __last2 - __first2);
                if (__len)
                  {
+                   const auto __len_bytes = __len * sizeof (*__first1);
                    const auto __c
-                     = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
+                     = __builtin_memcmp(&*__first1, &*__first2, __len_bytes)
+                       <=> 0;
                    if (__c != 0)
                      return __c;
                  }

Can you give the patch a try?
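The effect described above can be reproduced on any host with a small sketch. It is only an illustration, not the libstdc++ code: the arrays `kOne` and `kTwo` and the helper `compare_elems` are made up here, and the big-endian memory image of the two unsigned shorts is written out by hand so the result does not depend on the host's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The 16-bit values 1 and 2 written out in big-endian byte order, i.e. the
// memory image of an unsigned short on a big-endian target such as s390x.
const unsigned char kOne[2] = {0x00, 0x01};
const unsigned char kTwo[2] = {0x00, 0x02};

// Compare `count` elements of `elem_size` bytes each.  Passing elem_size == 1
// for 2-byte elements mimics the bug: the element count is used as the byte
// count.
int compare_elems(const void *a, const void *b, std::size_t count,
                  std::size_t elem_size) {
  assert(elem_size > 0);
  return std::memcmp(a, b, count * elem_size);
}
```

Here `compare_elems(kOne, kTwo, 1, 1)` only looks at the two zero high bytes and reports equality, whereas `compare_elems(kOne, kTwo, 1, 2)` sees the differing low bytes and returns a negative value.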
[Bug testsuite/111462] [14 regression] gcc.dg/tree-ssa/ssa-sink-18.c fails after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111462 --- Comment #6 from Stefan Schulze Frielinghaus ---
(In reply to Richard Biener from comment #5)
> (In reply to Stefan Schulze Frielinghaus from comment #4)
> > Since r14-4089-gd45ddc2c04e471 bootstrap fails on s390 with
> >
> > /devel/gcc/build/./prev-gcc/xg++ -B/devel/gcc/build/./prev-gcc/
> > -B/devel/gcc/dst/s390x-ibm-linux-gnu/bin/ -nostdinc++
> > -B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs
> > -B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs
> > -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include/s390x-ibm-
> > linux-gnu -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include
> > -I/devel/gcc/src/libstdc++-v3/libsupc++
> > -L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs
> > -L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs
> > -I/devel/gcc/src/libcpp -I. -I/devel/gcc/src/libcpp/../include
> > -I/devel/gcc/src/libcpp/include -g -O2 -fchecking=1 -W -Wall -Wno-narrowing
> > -Wwrite-strings -Wmissing-format-attribute -pedantic -Wno-long-long -Werror
> > -fno-exceptions -fno-rtti -I/devel/gcc/src/libcpp -I.
> > -I/devel/gcc/src/libcpp/../include -I/devel/gcc/src/libcpp/include-c -o
> > line-map.o -MT line-map.o -MMD -MP -MF .deps/line-map.Tpo
> > /devel/gcc/src/libcpp/line-map.cc
> > /devel/gcc/src/libcpp/line-map.cc: In function 'int
> > linemap_compare_locations(line_maps*, location_t, location_t)':
> > /devel/gcc/src/libcpp/line-map.cc:1434:1: error: statement uses released SSA
> > name
> >  1434 | linemap_compare_locations (line_maps *set,
> >       | ^
> > _219 = _216;
> > The use of _216 should have been replaced
> > during GIMPLE pass: dom
> > /devel/gcc/src/libcpp/line-map.cc:1434:1: internal compiler error: cannot
> > update SSA form
> >
> > If you think it might be helpful to reduce line-map.cc then just give me a
> > ping.
>
> that's PR111465, fixed by r14-4128-g564ecb7d5afb0b hopefully

Right, fixed on current trunk. Thanks!
[Bug testsuite/111462] [14 regression] gcc.dg/tree-ssa/ssa-sink-18.c fails after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111462

Stefan Schulze Frielinghaus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefansf at linux dot ibm.com

--- Comment #4 from Stefan Schulze Frielinghaus --- Since r14-4089-gd45ddc2c04e471 bootstrap fails on s390 with

/devel/gcc/build/./prev-gcc/xg++ -B/devel/gcc/build/./prev-gcc/ -B/devel/gcc/dst/s390x-ibm-linux-gnu/bin/ -nostdinc++ -B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs -B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include/s390x-ibm-linux-gnu -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include -I/devel/gcc/src/libstdc++-v3/libsupc++ -L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs -L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs -I/devel/gcc/src/libcpp -I. -I/devel/gcc/src/libcpp/../include -I/devel/gcc/src/libcpp/include -g -O2 -fchecking=1 -W -Wall -Wno-narrowing -Wwrite-strings -Wmissing-format-attribute -pedantic -Wno-long-long -Werror -fno-exceptions -fno-rtti -I/devel/gcc/src/libcpp -I. -I/devel/gcc/src/libcpp/../include -I/devel/gcc/src/libcpp/include-c -o line-map.o -MT line-map.o -MMD -MP -MF .deps/line-map.Tpo /devel/gcc/src/libcpp/line-map.cc
/devel/gcc/src/libcpp/line-map.cc: In function 'int linemap_compare_locations(line_maps*, location_t, location_t)':
/devel/gcc/src/libcpp/line-map.cc:1434:1: error: statement uses released SSA name
 1434 | linemap_compare_locations (line_maps *set,
      | ^
_219 = _216;
The use of _216 should have been replaced
during GIMPLE pass: dom
/devel/gcc/src/libcpp/line-map.cc:1434:1: internal compiler error: cannot update SSA form

If you think it might be helpful to reduce line-map.cc then just give me a ping.
[Bug rtl-optimization/110939] [14 Regression] 14.0 ICE at rtl.h:2297 while bootstrapping on loongarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110939 --- Comment #11 from Stefan Schulze Frielinghaus --- https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627024.html
[Bug rtl-optimization/110867] [14 Regression] ICE in combine after 7cdd0860949c6c3232e6cff1d7ca37bb5234074c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110867 --- Comment #9 from Stefan Schulze Frielinghaus --- It looks as if the first fix didn't entirely solve the problem. It turns out that the normal form of const_int is not always met. Before I release a new patch, could you test it first in order to make sure that I do not break bootstrapping again? I already gave it a try against the reproducer but would like to make sure that the whole bootstrap is successful.
[Bug rtl-optimization/110867] [14 Regression] ICE in combine after 7cdd0860949c6c3232e6cff1d7ca37bb5234074c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110867 --- Comment #8 from Stefan Schulze Frielinghaus --- Created attachment 55716 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55716&action=edit Really fix narrow comparison
[Bug rtl-optimization/110939] [14 Regression] 14.0 ICE at rtl.h:2297 while bootstrapping on loongarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110939 --- Comment #9 from Stefan Schulze Frielinghaus --- Thanks for the reproducer and sorry for the hassle. The normal form of a constant for a mode with fewer bits than in HOST_WIDE_INT is a sign-extended version of the original constant. This even holds for unsigned constants, which I missed. The following should fix this:

diff --git a/gcc/combine.cc b/gcc/combine.cc
index e46d202d0a7..9e5bf96a09d 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12059,7 +12059,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
                      : (GET_MODE_SIZE (int_mode)
                         - GET_MODE_SIZE (narrow_mode_iter)));
              *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
-             *pop1 = GEN_INT (n);
+             *pop1 = gen_int_mode (n, narrow_mode_iter);
              return adjusted_code;
            }
        }

Can you give this a try?
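As an illustration of the normal form mentioned above, the following sketch sign-extends a value from a given precision into a 64-bit integer. It is not GCC's implementation: the function name is hypothetical, `std::int64_t` merely stands in for HOST_WIDE_INT, and the unsigned-to-signed conversion is assumed to wrap in the usual two's-complement way (guaranteed since C++20).

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `precision` bits of `value` into a 64-bit integer.
// For a narrow mode this is the canonical form described above, even when
// the constant is meant to be interpreted as unsigned.
std::int64_t canonical_for_precision(std::uint64_t value, unsigned precision) {
  assert(precision >= 1 && precision <= 64);
  if (precision == 64)
    return static_cast<std::int64_t>(value);
  const std::uint64_t mask = (std::uint64_t{1} << precision) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (precision - 1);
  value &= mask;  // drop bits above the precision
  // Flipping and then subtracting the sign bit performs the sign extension.
  return static_cast<std::int64_t>((value ^ sign) - sign);
}
```

For an 8-bit mode the unsigned constant 0x80 therefore has the canonical value -128, not 128, which is the difference between the two calls in the patch.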
[Bug rtl-optimization/110939] [14 Regression] 14.0 ICE at rtl.h:2297 while bootstrapping on loongarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110939 --- Comment #6 from Stefan Schulze Frielinghaus --- I tried to reproduce it with a cross compiler using the reproducer from PR110867 but did not get an ICE. Can you attach a preprocessed source file and the corresponding gcc invocation?
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #18 from Stefan Schulze Frielinghaus --- Thanks again for testing. Very much appreciated! I like the idea of a comment and posted a patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626514.html
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #16 from Stefan Schulze Frielinghaus --- Turns out that my dejagnu foo is weak ;-) I came up with a wrong target selector. Should be fixed in the new attachment.
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #15 from Stefan Schulze Frielinghaus --- Created attachment 55688 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55688&action=edit Increase optimization and skip sparc for 4-6
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #14 from Stefan Schulze Frielinghaus --- For -3 and -4 I can confirm that we do not end up with a proper comparison during combine which means we should just ignore these on Sparc. I'm currently puzzled that -5 and -6 are actually processed on Sparc (32 or 64 bit) at all. Shouldn't this: /* { dg-do compile { target { lp64 } && ! target { sparc*-*-* } } } */ prevent this?
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #12 from Stefan Schulze Frielinghaus --- I have done a test with a cross-compiler and it looks to me as if we need -O2 instead of -O1 on Sparc in order to trigger the optimization. Can you give the attached patch a try? Sorry for all the hassle.
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #11 from Stefan Schulze Frielinghaus --- Created attachment 55686 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55686&action=edit Increase optimization
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #7 from Stefan Schulze Frielinghaus --- I've sent a patch for review: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626075.html and thanks for testing :)
[Bug rtl-optimization/110867] [14 Regression] ICE in combine after 7cdd0860949c6c3232e6cff1d7ca37bb5234074c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110867 --- Comment #4 from Stefan Schulze Frielinghaus --- Thanks for testing so quickly :) I've sent a patch for review: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626075.html
[Bug middle-end/110869] [14 regression] ICE in decompose, at rtl.h:2297
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110869 --- Comment #4 from Stefan Schulze Frielinghaus --- For sparc we already see some sort of pre-optimization which "breaks" the new test cases. For example, for test cmp-mem-const-1.c we have prior to combine:

(insn 14 13 41 2 (set (reg:SI 117)
        (ior:SI (reg:SI 118)
            (const_int 1023 [0x3ff]))) "cmp-mem-const-1.c":10:13 307 {iorsi3}
     (expr_list:REG_DEAD (reg:SI 118)
        (expr_list:REG_EQUAL (const_int 1073741823 [0x3fffffff])
            (nil))))
(insn 41 14 42 2 (set (reg:CC 100 %icc)
        (compare:CC (reg:SI 117)
            (reg:SI 116 [ *x_2(D) ]))) "cmp-mem-const-1.c":10:13 1 {*cmpsi_insn}
     (expr_list:REG_DEAD (reg:SI 117)
        (expr_list:REG_DEAD (reg:SI 116 [ *x_2(D) ])
            (nil))))

where the 64-bit constant 0x3fffffffffffffff already got chopped into a 32-bit constant 0x3fffffff. Thus in combine we only see

  narrow comparison from mode SI to QI: (MEM leu 0x3fffffff) to (MEM leu 0x3f)

whereas I have been pretty strict in the new tests and demanded to see a 64-bit constant:

  scan-rtl-dump "narrow comparison from mode DI to QI" "combine"

Thus one solution would be to not consider the source mode by using

  scan-rtl-dump "narrow comparison from mode .I to QI" "combine"

This would solve test cases cmp-mem-const-{1,2,3,4}.c. For cmp-mem-const-{5,6} the pre-optimization already chopped the 64-bit constant into a 32-bit constant and thus leaves us with nothing to do here. I'm not entirely sure how we handled such cases in the past. Though, one solution would be to simply exclude sparc from this test:

  /* { dg-do compile { target { lp64 } && ! target { sparc*-*-* } } } */

Would that be ok?
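Why the narrow comparison is equivalent to the wide one can be seen in a small sketch. This is only a model and not the combine code: the helper names are hypothetical, and the memory operand is represented as eight bytes in big-endian order so that the most significant byte sits at offset 0.

```cpp
#include <cassert>
#include <cstdint>

// Assemble a 64-bit value from 8 bytes stored in big-endian order.
std::uint64_t load_be64(const unsigned char *p) {
  assert(p != nullptr);
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v = (v << 8) | p[i];
  return v;
}

// Wide (DI-like) unsigned comparison: x <= 0x3fffffffffffffff.
bool wide_leu(const unsigned char *p) {
  return load_be64(p) <= 0x3fffffffffffffffULL;
}

// Narrow (QI-like) comparison: since all lower bytes of the constant are
// 0xff, only the most significant byte decides the result.
bool narrow_leu(const unsigned char *p) {
  assert(p != nullptr);
  return p[0] <= 0x3f;
}
```

Both functions agree for every input, which is what allows the large constant to be replaced by 0x3f.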
[Bug rtl-optimization/110867] ICE in combine after 7cdd0860949c6c3232e6cff1d7ca37bb5234074c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110867

Stefan Schulze Frielinghaus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefansf at linux dot ibm.com

--- Comment #1 from Stefan Schulze Frielinghaus --- The optimization introduced by r14-2879-g7cdd0860949c6c hits during combination of insn

(insn 31 3 32 2 (set (reg:SI 118 [ _1 ])
        (mem:SI (reg/v/f:SI 115 [ a ]) [1 *a_4(D)+0 S4 A64])) "t.c":15:7 758 {*arm_movsi_vfp}
     (nil))

and

(insn 9 32 10 2 (set (reg:CC 100 cc)
        (compare:CC (reg:SI 118 [ _1 ])
            (const_int -2147483648 [0xffffffff80000000]))) "t.c":15:6 272 {*arm_cmpsi_insn}
     (nil))

The idea of r14-2879-g7cdd0860949c6c is to get rid of large constants while performing an unsigned comparison. In this case it looks like a 32-bit constant is sign-extended into a 64-bit constant and then a 32-bit comparison is done. While writing the optimization I always assumed that the constant does fit into int_mode, which is apparently not the case here. Thus one possible solution would be to simply bail out in those cases:

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 0d99fa541c5..e46d202d0a7 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -11998,11 +11998,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
      x0 >= 0x40.  */
   if ((code == LEU || code == LTU || code == GEU || code == GTU)
       && is_a <scalar_int_mode> (GET_MODE (op0), &int_mode)
+      && HWI_COMPUTABLE_MODE_P (int_mode)
       && MEM_P (op0)
       && !MEM_VOLATILE_P (op0)
       /* The optimization makes only sense for constants which are big enough
          so that we have a chance to chop off something at all.  */
       && (unsigned HOST_WIDE_INT) const_op > 0xff
+      /* Bail out, if the constant does not fit into INT_MODE.  */
+      && (unsigned HOST_WIDE_INT) const_op
+         < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
       /* Ensure that we do not overflow during normalization.  */
       && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
     {

Does this resolve the problem for you?
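The bound used in the bail-out check is computed with two shifts rather than one. A minimal sketch of that computation follows, assuming a 64-bit `std::uint64_t` as a stand-in for unsigned HOST_WIDE_INT; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Largest unsigned value representable in `precision` bits.  Shifting by
// (precision - 1) and then by 1 keeps each shift amount below 64, so the
// expression stays well defined even for precision == 64, where a single
// shift by 64 would be undefined behaviour.
std::uint64_t max_unsigned_for_precision(unsigned precision) {
  assert(precision >= 1 && precision <= 64);
  return ((std::uint64_t{1} << (precision - 1)) << 1) - 1;
}
```

For precision 32 this yields 0xffffffff, so the sign-extended constant 0xffffffff80000000 from the insn above lies beyond the bound and the optimization is skipped.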
[Bug middle-end/109265] New: ICE for 527.cam4_r after r13-6787-g0963cb5fde158c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109265

            Bug ID: 109265
           Summary: ICE for 527.cam4_r after r13-6787-g0963cb5fde158c
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stefansf at linux dot ibm.com
  Target Milestone: ---

$ gfortran -c -o sgexx.fppized.o -I. -Iinclude -Inetcdf/include -O3 -std=legacy -fconvert=big-endian sgexx.F90
during GIMPLE pass: dom
sgexx.F90:8996:23:

 8996 | SUBROUTINE SLAMC2( BETA, T, RND, EPS, EMIN, RMIN, EMAX, RMAX )
      | ^
internal compiler error: in in_chain_p, at gimple-range-gori.cc:119
0x24e31cf range_def_chain::in_chain_p(tree_node*, tree_node*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:119
0x24e47e9 gori_compute::compute_operand_range(vrange&, gimple*, vrange const&, tree_node*, fur_source&, value_relation*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:667
0x24e5b47 gori_compute::compute_operand1_range(vrange&, gimple_range_op_handler&, vrange const&, tree_node*, fur_source&, value_relation*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:1174
0x24e4739 gori_compute::compute_operand_range(vrange&, gimple*, vrange const&, tree_node*, fur_source&, value_relation*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:726
0x24e6505 gori_compute::compute_operand2_range(vrange&, gimple_range_op_handler&, vrange const&, tree_node*, fur_source&, value_relation*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:1254
0x24e6a91 gori_compute::compute_operand1_and_operand2_range(vrange&, gimple_range_op_handler&, vrange const&, tree_node*, fur_source&, value_relation*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:1274
0x24e4911 gori_compute::compute_operand_range(vrange&, gimple*, vrange const&, tree_node*, fur_source&, value_relation*)
        /devel/gcc/src/gcc/gimple-range-gori.cc:723
0x24e8b75 gori_compute::outgoing_edge_range_p(vrange&, edge_def*, tree_node*, range_query&)
        /devel/gcc/src/gcc/gimple-range-gori.cc:1384
0x24d71c5 ranger_cache::edge_range(vrange&, edge_def*, tree_node*, ranger_cache::rfd_mode)
        /devel/gcc/src/gcc/gimple-range-cache.cc:964
0x24d732d ranger_cache::range_on_edge(vrange&, edge_def*, tree_node*)
        /devel/gcc/src/gcc/gimple-range-cache.cc:1001
0x24df129 fold_using_range::range_of_range_op(vrange&, gimple_range_op_handler&, fur_source&)
        /devel/gcc/src/gcc/gimple-range-fold.cc:558
0x24e11c1 fold_using_range::fold_stmt(vrange&, gimple*, fur_source&, tree_node*)
        /devel/gcc/src/gcc/gimple-range-fold.cc:489
0x24e171d fold_range(vrange&, gimple*, edge_def*, range_query*)
        /devel/gcc/src/gcc/gimple-range-fold.cc:326
0x24e8e03 gori_compute::outgoing_edge_range_p(vrange&, edge_def*, tree_node*, range_query&)
        /devel/gcc/src/gcc/gimple-range-gori.cc:1411
0x24d6c55 ranger_cache::range_from_dom(vrange&, tree_node*, basic_block_def*, ranger_cache::rfd_mode)
        /devel/gcc/src/gcc/gimple-range-cache.cc:1524
0x24d8e9b ranger_cache::range_from_dom(vrange&, tree_node*, basic_block_def*, ranger_cache::rfd_mode)
        /devel/gcc/src/gcc/gimple-range-cache.cc:1421
0x24d8e9b ranger_cache::fill_block_cache(tree_node*, basic_block_def*, basic_block_def*)
        /devel/gcc/src/gcc/gimple-range-cache.cc:1212
0x24d9d2f ranger_cache::block_range(vrange&, basic_block_def*, tree_node*, bool)
        /devel/gcc/src/gcc/gimple-range-cache.cc:1039
0x24cedc9 gimple_ranger::range_on_entry(vrange&, basic_block_def*, tree_node*)
        /devel/gcc/src/gcc/gimple-range.cc:156
0x24d23b3 gimple_ranger::range_of_expr(vrange&, tree_node*, gimple*)
        /devel/gcc/src/gcc/gimple-range.cc:130
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Started after r13-6787-g0963cb5fde158c

I have no experience with reducing a Fortran program. Is there an equivalent of -E for gfortran? Reducing via delta results in dependencies like

    9 | use shr_kind_mod,only: r8 => shr_kind_r8
      | 1
Fatal Error: Cannot open module file 'shr_kind_mod.mod' for reading at (1): No such file or directory
compilation terminated.

Currently I circumvent those errors by copying modules into my temp directory, but of course in the end I cannot upload those modules.
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #16 from Stefan Schulze Frielinghaus --- Fixed in mainline. Fine for me to close this now.
[Bug tree-optimization/108687] [13 Regression] Non-termination since r13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687 --- Comment #10 from Stefan Schulze Frielinghaus --- Can confirm the attached patch solves this issue.
[Bug tree-optimization/108687] [13 Regression] Non-termination since r13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687 --- Comment #8 from Stefan Schulze Frielinghaus --- I came up with a cross compiler where I can reproduce it:

FROM fedora:37
RUN dnf -y upgrade \
    && dnf -y install 'dnf-command(builddep)' \
    && dnf -y builddep gcc \
    && dnf -y install binutils-s390x-linux-gnu git \
    && dnf clean metadata
RUN git clone --depth 1 https://gcc.gnu.org/git/gcc.git \
    && mkdir /build \
    && cd /build \
    && /gcc/configure --target=s390x-linux-gnu \
                      --enable-languages=c \
                      --disable-nls \
                      --without-headers \
                      --disable-multilib \
    && make -j$(nproc) all-gcc \
    && make install-gcc

Running inside the container

$ /usr/local/bin/s390x-linux-gnu-gcc -O3 -march=z13 -c t.c

does not terminate for me. Hope this makes debugging easier. If you need anything else just let me know.
[Bug tree-optimization/108687] [13 Regression] Non-termination since r13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687 --- Comment #6 from Stefan Schulze Frielinghaus --- Just to be sure: in the initial comment I missed adding -march=z13 and only mentioned it in comment 2. I will come up with those logs and mail them to you.
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #14 from Stefan Schulze Frielinghaus --- I'm still working on this and currently test a new patch which should fix the scheduler handling in the backend.
[Bug middle-end/108687] Non-termination since r13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687 --- Comment #4 from Stefan Schulze Frielinghaus --- I have added a backtrace from GDB where I randomly interrupted. Hope this helps to narrow it down.
[Bug middle-end/108687] Non-termination since r13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687 --- Comment #3 from Stefan Schulze Frielinghaus --- Created attachment 54415 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54415&action=edit Random backtrace after some time
[Bug middle-end/108687] Non-termination since r13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687 --- Comment #2 from Stefan Schulze Frielinghaus ---
(In reply to Stefan Schulze Frielinghaus from comment #0)
> Running gcc -O3 -c t.c on s390x does not terminate.

More specifically: gcc -O3 -march=z13 -c t.c
[Bug middle-end/108687] New: Non-termination since gcc-13-5630-g881bf8de9b0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108687

            Bug ID: 108687
           Summary: Non-termination since gcc-13-5630-g881bf8de9b0
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stefansf at linux dot ibm.com
  Target Milestone: ---

typedef struct
{
  int a[32];
} b;

b c;
int d, e, f;

void
g ()
{
  for (; c.a[f - 1]; f++)
    {
      e = e * d;
      c.a[f] = f / d;
    }
}

Running gcc -O3 -c t.c on s390x does not terminate. Started with gcc-13-5630-g881bf8de9b0
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #12 from Stefan Schulze Frielinghaus --- The culprit seems to be that s390_sched_init is not called in one particular case. We have the following basic blocks and edges:

  6 --> 12 --> 13 --> 14

The edges from 12 to 13 and 13 to 14 are fall-through edges, which means in function s390_sched_init we "inherit" last_scheduled_unit_distance from the previous block, i.e., we do not zero it. The edge from 6 to 12 is a non-fall-through edge, which means if we schedule bb 12, then s390_sched_init will be called and last_scheduled_unit_distance will be zeroed. The problem is that bb 12 is empty if no debug information is generated, and contains only debug insns if debug information is generated. Thus, in the non-debug case, when bb 12 is empty, it is never scheduled; therefore s390_sched_init is never called and last_scheduled_unit_distance is never zeroed. We also see this when inspecting last_scheduled_unit_distance at the very beginning of function schedule_block for bb 13, where we have:

  non-debug: 2 2 0 2 34 0 34 29
  debug:     0 0 0 0 0 0 0 0

In the debug case we "inherit" for bb 13 from bb 12 last_scheduled_unit_distance, which got cleared once bb 12 was scheduled. In the non-debug case we also "inherit" the array, but it did not get cleared in bb 12 because it was never scheduled.
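The mechanism can be mimicked with a toy model. This is a deliberately simplified sketch and not the s390 backend: `Block`, its fields, and `state_seen_by_last_block` are hypothetical, a single integer stands in for the last_scheduled_unit_distance array, and the rule "a block without any insns is never scheduled" is taken from the description above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a basic block as seen by the scheduler.
struct Block {
  int real_insns;       // number of non-debug insns
  int debug_insns;      // number of debug insns (0 when compiled without -g)
  bool fallthru_entry;  // true if the block is entered via a fall-through edge
};

// Walk the blocks in order and return the scheduler state observed at the
// start of the last scheduled block.  `stale` is whatever state an earlier
// region left behind.
int state_seen_by_last_block(const std::vector<Block> &blocks, int stale) {
  assert(!blocks.empty());
  int state = stale;
  int seen = state;
  for (const Block &bb : blocks) {
    // A block without any insns is never scheduled, so the init hook is not
    // called for it and the state is left untouched.
    if (bb.real_insns + bb.debug_insns == 0)
      continue;
    // Init hook: reset the state unless the block is entered by fall-through.
    if (!bb.fallthru_entry)
      state = 0;
    seen = state;
    state += bb.real_insns;  // scheduling real insns updates the state
  }
  return seen;
}
```

With a first block that holds only debug insns, the second block starts from a zeroed state; with an entirely empty first block, it starts from the stale state, so the two compilations can make different scheduling decisions.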
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #11 from Stefan Schulze Frielinghaus --- Please find attached a reduced version of the initial problem. If compiled with

g++ -O2 -march=arch13 -fno-exceptions (-g)

there is still a difference depending on whether it is built with debug information or not:

diff <(objdump -d reduced.o-without-debug) <(objdump -d reduced.o-with-debug)
2c2
< reduced.o-without-debug: file format elf64-s390
---
> reduced.o-with-debug: file format elf64-s390
94,97c94,97
<  1b8: e5 48 f0 a8 00 00  mvghi 168(%r15),0
<  1be: e3 50 f0 c8 00 04  lg %r5,200(%r15)
<  1c4: 41 30 f0 a0        la %r3,160(%r15)
<  1c8: e3 50 f0 a0 00 24  stg %r5,160(%r15)
---
>  1b8: e3 50 f0 c8 00 04  lg %r5,200(%r15)
>  1be: e5 48 f0 a8 00 00  mvghi 168(%r15),0
>  1c4: e3 50 f0 a0 00 24  stg %r5,160(%r15)
>  1ca: 41 30 f0 a0        la %r3,160(%r15)

The corresponding insns are:

Without debug information:
mvghi => insn 207
lg    => insn 206
la    => insn 310
stg   => insn 312

With debug information:
lg    => insn 427
mvghi => insn 428
stg   => insn 533
la    => insn 531

In split3 the order of the insns is the same; it changes in sched2, where we have:

Without debug information:

;; ======================================================
;; -- basic block 14 from 87 to 355 -- after reload
;; ======================================================
;;  0--> b 0: i 87 %r2=0 :nothing
;;  1--> b 0: i 88 {%r2=call [`_ZN4Rust4TyTy9ParamType7resolveEv'];clobber %r14;}:nothing
;;  2--> b 0: i 207 [%r15+0xa8]=0 :nothing
;;  3--> b 0: i 206 %r5=[%r15+0xc8] :nothing
;;  4--> b 0: i 310 %r3=%r15+0xa0 :nothing
;;  5--> b 0: i 312 [%r15+0xa0]=%r5 :nothing
;;  6--> b 0: i 311 %r2=%r15+0xc0 :nothing
;;  7--> b 0: i 96 {call [`_ZNSt6vectorIPN4Rust4TyTy8BaseTypeESaIS3_EE17_M_realloc_insertIN9__gnu_cxx17__normal_iteratorIPS3_S5_vT_'];clobber %r14;}:nothing
;;  8--> b 0: i 355 pc=L174 :nothing
;; Ready list (final):
;; total time = 8
;; new head = 87
;; new tail = 355

With debug information:

;; ======================================================
;; -- basic block 14 from 201 to 585 -- after reload
;; ======================================================
;;  0--> b 0: i 201 debug_marker:nothing
;;  0--> b 0: i 202 %r2=0 :nothing
;;  1--> b 0: i 203 {%r2=call [`_ZN4Rust4TyTy9ParamType7resolveEv'];clobber %r14;}:nothing
;;  1--> b 0: i 204 debug_marker:nothing
;;  1--> b 0: i 205 loc %r15+0xc0 :nothing
;;  1--> b 0: i 206 debug_marker:nothing
;;  1--> b 0: i 207 loc %r15+0xc0 :nothing
;;  1--> b 0: i 208 debug_marker:nothing
;;  1--> b 0: i 210 loc debug_implicit_ptr :nothing
;;  1--> b 0: i 211 loc [%r15+0xc8] :nothing
;;  1--> b 0: i 212 debug_marker:nothing
;;  1--> b 0: i 214 loc clobber :nothing
;;  1--> b 0: i 215 loc clobber :nothing
;;  1--> b 0: i 216 loc clobber :nothing
;;  2--> b 0: i 427 %r5=[%r15+0xc8] :nothing
;;  3--> b 0: i 428 [%r15+0xa8]=0 :nothing
;;  4--> b 0: i 533 [%r15+0xa0]=%r5 :nothing
;;  5--> b 0: i 531 %r3=%r15+0xa0 :nothing
;;  6--> b 0: i 532 %r2=%r15+0xc0 :nothing
;;  7--> b 0: i 222 {call [`_ZNSt6vectorIPN4Rust4TyTy8BaseTypeESaIS3_EE17_M_realloc_insertIN9__gnu_cxx17__normal_iteratorIPS3_S5_vT_'];clobber %r14;}:nothing
;;  7--> b 0: i 223 loc clobber :nothing
;;  8--> b 0: i 585 pc=L373 :nothing
;; Ready list (final):
;; total time = 8
;; new head = 201
;; new tail = 585
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #10 from Stefan Schulze Frielinghaus --- Created attachment 54279 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54279&action=edit RTL dump of sched2 if compiled with debug information
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #9 from Stefan Schulze Frielinghaus --- Created attachment 54278 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54278&action=edit RTL dump of sched2 if compiled without debug information
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #8 from Stefan Schulze Frielinghaus --- Created attachment 54277 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54277&action=edit reduced version of the initial problem
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #7 from Stefan Schulze Frielinghaus --- The difference in the assembly output shown in comment 2 happens in function void AssociatedImplTrait::setup_associated_types ( const TyTy::BaseType *self, const TyTy::TypeBoundPredicate &bound)
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #6 from Stefan Schulze Frielinghaus --- Created attachment 54154 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54154&action=edit preprocessed rust-hir-trait-resolve.cc
[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 --- Comment #4 from Stefan Schulze Frielinghaus --- I was playing around with this and was wondering how can I actually execute the stageN compiler? From the output of make I see two compilations for object rust-hir-trait-resolve.o. Thus the first one must be for stage2 and the second one for stage3. For the former the command line is /devel/gcc/build/./prev-gcc/xg++ -B/devel/gcc/build/./prev-gcc/ -B/devel/gcc/dst/s390x-ibm-linux-gnu/bin/ -nostdinc++ -B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs -B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include/s390x-ibm-linux-gnu -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include -I/devel/gcc/src/libstdc++-v3/libsupc++ -L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs -L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -DIN_GCC_FRONTEND -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -Wno-unused-parameter -fno-common -DHAVE_CONFIG_H -I. 
-Irust -I/devel/gcc/src/gcc -I/devel/gcc/src/gcc/rust -I/devel/gcc/src/gcc/../include -I/devel/gcc/src/gcc/../libcpp/include -I/devel/gcc/src/gcc/../libcody -I/devel/gcc/src/gcc/../libdecnumber -I/devel/gcc/src/gcc/../libdecnumber/dpd -I../libdecnumber -I/devel/gcc/src/gcc/../libbacktrace -o rust/rust-hir-trait-resolve.o -MT rust/rust-hir-trait-resolve.o -MMD -MP -MF rust/.deps/rust-hir-trait-resolve.TPo -g -O2 -fno-checking -gtoggle -I /devel/gcc/src/gcc/rust -I /devel/gcc/src/gcc/rust/lex -I /devel/gcc/src/gcc/rust/parse -I /devel/gcc/src/gcc/rust/ast -I /devel/gcc/src/gcc/rust/analysis -I /devel/gcc/src/gcc/rust/backend -I /devel/gcc/src/gcc/rust/expand -I /devel/gcc/src/gcc/rust/hir/tree -I /devel/gcc/src/gcc/rust/hir -I /devel/gcc/src/gcc/rust/resolve -I /devel/gcc/src/gcc/rust/util -I /devel/gcc/src/gcc/rust/typecheck -I /devel/gcc/src/gcc/rust/checks/lints -I /devel/gcc/src/gcc/rust/checks/errors -I /devel/gcc/src/gcc/rust/checks/errors/privacy -I /devel/gcc/src/gcc/rust/util -I /devel/gcc/src/gcc/rust/metadata /devel/gcc/src/gcc/rust/typecheck/rust-hir-trait-resolve.cc and the current working directory was most likely /devel/gcc/build/gcc. Creating a symlink from $build/stage1-gcc to $build/prev-gcc and then running the command from above doesn't do the trick. There is probably an easier way which I miss. Any hints?
[Bug rust/108102] rust bootstrap comparison failure on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102 Stefan Schulze Frielinghaus changed:

           What    |Removed |Added
             CC    |        |stefansf at linux dot ibm.com

--- Comment #2 from Stefan Schulze Frielinghaus --- Can confirm. It happens with --with-arch=arch13 and started when rust was added to the languages via commit r13-4676-ga75f038c069cc3.

$ diff <(objdump -d stage2-gcc/rust/rust-hir-trait-resolve.o) \
       <(objdump -d stage3-gcc/rust/rust-hir-trait-resolve.o)
2c2
< stage2-gcc/rust/rust-hir-trait-resolve.o: file format elf64-s390
---
> stage3-gcc/rust/rust-hir-trait-resolve.o: file format elf64-s390
1939,1940c1939,1940
< 24ec: e3 20 f2 50 00 24 stg %r2,592(%r15)
< 24f2: e3 30 f1 28 00 04 lg %r3,296(%r15)
---
> 24ec: e3 30 f1 28 00 04 lg %r3,296(%r15)
> 24f2: e3 20 f2 50 00 24 stg %r2,592(%r15)
[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088 --- Comment #10 from Stefan Schulze Frielinghaus --- (In reply to rsand...@gcc.gnu.org from comment #8) > Looks good, but maybe: > > GET_MODE_SIZE (int_mode) > 1 > > would be more general. I very much like the idea of a size guard. Posted a patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602776.html
[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088 --- Comment #6 from Stefan Schulze Frielinghaus --- I did a quick test using

diff --git a/gcc/cselib.cc b/gcc/cselib.cc
index 9b582e5d3d6..2fd0190bc79 100644
--- a/gcc/cselib.cc
+++ b/gcc/cselib.cc
@@ -1571,6 +1571,7 @@ new_cselib_val (unsigned int hash, machine_mode mode, rtx x)
   scalar_int_mode int_mode;
   if (REG_P (x)
       && is_int_mode (mode, &int_mode)
+      && int_mode != BImode
       && REG_VALUES (REGNO (x)) != NULL
       && (!cselib_current_insn || !DEBUG_INSN_P (cselib_current_insn)))
     {

which solved the cross ia64 build for me. There may be further integer modes which I didn't consider, so I will have a thorough look at it next week.
[Bug rtl-optimization/107094] [13 Regression] ICE in require, at machmode.h:297 since r13-2916-gd0b00b63a39108
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107094 --- Comment #1 from Stefan Schulze Frielinghaus --- This looks related to PR107088.
[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088 --- Comment #5 from Stefan Schulze Frielinghaus --- Thanks for looking into this! I'm currently out of office and have very limited internet access. I will be back on Tuesday and look right into this. If this is too late, feel free to revert my patch. Sorry for the inconvenience!
[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088 --- Comment #1 from Stefan Schulze Frielinghaus --- The patch introduces scalar_int_mode int_mode; if (REG_P (x) && is_int_mode (mode, &int_mode) && REG_VALUES (REGNO (x)) != NULL && (!cselib_current_insn || !DEBUG_INSN_P (cselib_current_insn))) { rtx copy = shallow_copy_rtx (x); scalar_int_mode narrow_mode_iter; FOR_EACH_MODE_UNTIL (narrow_mode_iter, int_mode) // < { PUT_MODE_RAW (copy, narrow_mode_iter); cselib_val *v = cselib_lookup (copy, narrow_mode_iter, 0, VOIDmode); if (v) { rtx sub = lowpart_subreg (narrow_mode_iter, e->val_rtx, int_mode); if (sub) new_elt_loc_list (v, sub); } } } The failing assert is at the for-loop which is supposed to iterate only over integer modes up to int_mode. I'm not familiar with ia64; is there any machine which I could use for debugging? The failing assert is gcc_checking_assert (m_mode != E_VOIDmode); which is triggered by get_known_wider. Would be interesting to see the initial value of int_mode and if/how FOR_EACH_MODE_UNTIL actually ends up with E_VOIDmode.
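The iteration pattern discussed above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `toy_mode` enumeration, `toy_wider`, and both counting functions are hypothetical names and are not GCC's mode tables or macros. It shows one way an "iterate from the narrowest mode until END" loop can run off the end of the chain, namely when END is not reachable by repeatedly taking the next wider mode. Whether this is what happens on ia64 is not established by the comment above.

```c
#include <assert.h>

/* Hypothetical toy modes; NOT GCC's machine_mode enumeration.  */
enum toy_mode { TOY_VOID, TOY_BI, TOY_QI, TOY_HI, TOY_SI, TOY_DI };

/* Next wider mode on the toy chain QI -> HI -> SI -> DI.
   TOY_BI is deliberately not part of the chain (an assumption made
   for illustration).  Anything without a wider mode yields TOY_VOID.  */
static enum toy_mode toy_wider(enum toy_mode m)
{
    switch (m) {
    case TOY_QI: return TOY_HI;
    case TOY_HI: return TOY_SI;
    case TOY_SI: return TOY_DI;
    default:     return TOY_VOID;
    }
}

/* Count the modes visited when walking from the narrowest toy mode up
   to, but not including, END.  Returns -1 if the walk falls off the
   chain before reaching END; this is the situation in which a checking
   assert on "mode != VOID" would fire.  */
static int toy_count_narrower(enum toy_mode end)
{
    int n = 0;
    for (enum toy_mode m = TOY_QI; m != end; m = toy_wider(m)) {
        if (m == TOY_VOID)
            return -1;   /* END was never reached */
        n++;
    }
    return n;
}

/* Same walk, but skipped entirely for modes of at most one byte,
   in the spirit of a size guard.  SIZE_IN_BYTES is supplied by the
   caller in this toy model.  */
static int toy_count_narrower_guarded(enum toy_mode end, unsigned size_in_bytes)
{
    if (size_in_bytes <= 1)
        return 0;        /* nothing narrower to look up */
    return toy_count_narrower(end);
}
```

In this model the walk towards `TOY_SI` visits two modes, while the walk towards `TOY_BI` never terminates normally and is reported as -1; the guarded variant does not start the walk for a one-byte mode at all.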
[Bug target/106355] Linux s390x -O2 argument passing miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355 --- Comment #4 from Stefan Schulze Frielinghaus --- The problem is with sibling call optimization where parameters with BLKmode are not handled correctly. I will prepare a patch and submit it shortly.
[Bug debug/100960] var-tracking: parameter location in subregister not tracked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100960 --- Comment #6 from Stefan Schulze Frielinghaus --- Created attachment 53433 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53433&action=edit a-t2.c.325r.vartrack
[Bug debug/100960] var-tracking: parameter location in subregister not tracked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100960 --- Comment #5 from Stefan Schulze Frielinghaus --- However, I found another example (see attachment a-t2.c.325r.vartrack) which does not profit from the patch: __attribute__((noinline, noclone)) void fn1 (int x) { __asm volatile ("" : "+r" (x) : : "memory"); } __attribute__((noinline, noclone)) int fn2 (int x, int y) { if (x) { // x is copied into call-saved r11 fn1 (x); // locs of x point to entry value only // ignoring r11 fn1 (x); } return y; } __attribute__((noinline, noclone)) int fn3 (int x, int y) { return fn2 (x, y); } int main () { fn3 (36, 25); return 0; } For fn2 the value for parameter x is 5:5 cselib hash table: ... (value/u:SI 5:5 @0x5fb9420/0x5f5e600) locs: from insn 1 (value/u:SI 6:263 @0x5fb9438/0x5f5e630) from insn 1 (entry_value:SI (reg:SI 2 %r2 [ xD.2274 ])) from insn 1 (reg:SI 2 %r2 [ xD.2274 ]) no addrs which is recorded in bb 2. In bb 4 (the true branch of the if) register r2 is saved in r11: bb 4 op 0 insn 36 MO_VAL_USE (concat/v:DI (value/u:DI 26:26 @0x5fb9618/0x5f5e9f0) (reg:DI 2 %r2 [64])) bb 4 op 1 insn 36 MO_VAL_SET (concat/u:DI (value/u:DI 26:26 @0x5fb9618/0x5f5e9f0) (set (reg/v:DI 11 %r11 [orig:61 xD.2274+-4 ] [61]) (reg:DI 2 %r2 [64]))) (insn 36 10 11 4 (set (reg/v:DI 11 %r11 [orig:61 xD.2274+-4 ] [61]) (reg:DI 2 %r2 [64])) 1472 {*movdi_64} (nil)) cselib hash table: (value/u:DI 26:26 @0x5fb9618/0x5f5e9f0) locs: from insn 36 (reg/v:DI 11 %r11 [orig:61 xD.2274+-4 ] [61]) from insn 36 (reg:DI 2 %r2 [64]) no addrs cselib preserved hash table: ... (value/u:SI 5:5 @0x5fb9420/0x5f5e600) locs: from insn 1 (value/u:SI 6:263 @0x5fb9438/0x5f5e630) from insn 1 (entry_value:SI (reg:SI 2 %r2 [ xD.2274 ])) no addrs However at bb 4 the relation between r2 and value 5:5 is lost (except the entry value relation). Thus I cannot record the subvalue relation between 5:5 and 26:26 at least not during creation of 26:26. Since cselib resets its table after jumps I'm not sure how to proceed here. 
Any ideas? I would also be open to the second idea, i.e., pretending that the move is not a DImode copy but an SImode copy. However, I'm not sure how to look up the mode of the actual type. Any pointers?
[Bug debug/100960] var-tracking: parameter location in subregister not tracked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100960 --- Comment #4 from Stefan Schulze Frielinghaus --- I really like the idea of enhancing cselib since there is a chance that other passes might profit from it, too. The following patch fixes the initial reported problem: diff --git a/gcc/cselib.cc b/gcc/cselib.cc index 6a5609786fa..64b6996a299 100644 --- a/gcc/cselib.cc +++ b/gcc/cselib.cc @@ -1569,6 +1569,25 @@ new_cselib_val (unsigned int hash, machine_mode mode, rtx x) e->locs = 0; e->next_containing_mem = 0; + scalar_int_mode int_mode; + if (REG_P (x) && is_int_mode (mode, &int_mode) && REG_VALUES (REGNO (x)) != NULL + && (!cselib_current_insn || !DEBUG_INSN_P (cselib_current_insn))) +{ + rtx copy = shallow_copy_rtx (x); + scalar_int_mode narrow_mode; + FOR_EACH_MODE_UNTIL(narrow_mode, int_mode) + { + PUT_MODE_RAW (copy, narrow_mode); + cselib_val *v = cselib_lookup (copy, narrow_mode, 0, VOIDmode); + if (v) + { + rtx sub = lowpart_subreg (narrow_mode, e->val_rtx, int_mode); + if (sub) + new_elt_loc_list (v, sub); + } + } +} + if (dump_file && (dump_flags & TDF_CSELIB)) { fprintf (dump_file, "cselib value %u:%u ", e->uid, hash); So I get the subvalue relation between 5:5 and 14:14 (was initially 15:15 but changed meanwhile due to new GCC version) (value/u:SI 5:5 @0x4f906e0/0x4f80730) locs: from insn 17 (subreg:SI (value/u:DI 14:14 @0x4f907b8/0x4f808e0) 4) from insn 1 (value/u:SI 6:263 @0x4f906f8/0x4f80760) from insn 1 (entry_value:SI (reg:SI 2 %r2 [ xD.2274 ])) no addrs
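The location `(subreg:SI (value/u:DI 14:14) 4)` above uses byte offset 4, which is where the low 32 bits of a 64-bit value sit on a big-endian target such as s390x. The following is a minimal sketch of that arithmetic, not GCC code: `lowpart_byte_offset`, `store_be64`, `load_be32` and `si_lowpart_of_di_be` are hypothetical helper names, and the model ignores everything except plain byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the low OUTER bytes inside an INNER-byte value.
   Toy model: on big-endian the low part is at the end, on
   little-endian it is at the start.  */
static size_t lowpart_byte_offset(size_t outer, size_t inner, int big_endian)
{
    assert(outer <= inner);
    return big_endian ? inner - outer : 0;
}

/* Store V into OUT in big-endian byte order.  */
static void store_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (8 * (7 - i)));
}

/* Load a 32-bit big-endian value from P.  */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
           | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read the 4-byte low part of an 8-byte value as it would be laid out
   on a big-endian machine.  */
static uint32_t si_lowpart_of_di_be(uint64_t di)
{
    uint8_t bytes[8];
    store_be64(bytes, di);
    return load_be32(bytes + lowpart_byte_offset(4, 8, 1));
}
```

With a 64-bit register whose upper half holds unrelated bits and whose lower half holds 36, the low part read at offset 4 is 36, which is the kind of subvalue relation the patch records.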
[Bug rtl-optimization/101260] [10/11 Regression] regcprop: Determine subreg offset depending on endianness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 --- Comment #18 from Stefan Schulze Frielinghaus --- Fixed for 12 and mainline.
[Bug rtl-optimization/104814] [10/11 Regression] ifcvt: Deleting live variable in IF-CASE-2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104814 --- Comment #7 from Stefan Schulze Frielinghaus --- Gave trunk a try and it worked fine for me. Thanks for the fix!
[Bug rtl-optimization/104814] [10/11/12 Regression] ifcvt: Deleting live variable in IF-CASE-2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104814 --- Comment #3 from Stefan Schulze Frielinghaus --- I forgot to mention that the command line is just: gcc -O1 t.c. It works fine with -O{0,2,3}.
[Bug rtl-optimization/104814] [10/11/12 Regression] ifcvt: Deleting live variable in IF-CASE-2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104814 --- Comment #1 from Stefan Schulze Frielinghaus --- Created attachment 52571 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52571&action=edit dump combine
[Bug rtl-optimization/104814] New: [10/11/12 Regression] ifcvt: Deleting live variable in IF-CASE-2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104814 Bug ID: 104814 Summary: [10/11/12 Regression] ifcvt: Deleting live variable in IF-CASE-2 Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Target: s390x-*-* Created attachment 52570 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52570&action=edit dump ce2 short a = 0; static long b = 0; int c = 7; char d = 0; short *e = &a; long f = 0; unsigned long g (unsigned long h, long j) { return j == 0 ? h : h / j; } int main (void) { long k = f; for (; c; --c) { for (int i = 0; i < 7; ++i) ; long m = g (f, --b); d = ((char)m | *e) <= 43165; } if (b != -7) __builtin_abort (); return 0; } Variable b should be decremented in each iteration of the outer loop and thus equal -7 at the end. After combine a load, decrement, and store insn exists: (insn 13 12 14 3 (set (reg:DI 62 [ b_lsm.16 ]) (mem/c:DI (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) (const_int 8 [0x8]))) [1 b+0 S8 A64])) 1469 {*movdi_64} (nil)) (jump_insn 40 39 46 6 (parallel [ (set (pc) (if_then_else (ne (reg:DI 62 [ b_lsm.16 ]) (const_int 1 [0x1])) (label_ref 38) (pc))) (set (reg:DI 62 [ b_lsm.16 ]) (plus:DI (reg:DI 62 [ b_lsm.16 ]) (const_int -1 [0x]))) (clobber (scratch:DI)) (clobber (reg:CC 33 %cc)) ]) "t.c":8:63 2164 {doloop_di} (expr_list:REG_UNUSED (reg:CC 33 %cc) (int_list:REG_BR_PROB 536870916 (nil))) -> 38) (insn 48 47 49 7 (set (mem/c:DI (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) (const_int 8 [0x8]))) [1 b+0 S8 A64]) (reg:DI 62 [ b_lsm.16 ])) 1469 {*movdi_64} (expr_list:REG_DEAD (reg:DI 62 [ b_lsm.16 ]) (nil))) Pass ce2 deletes jump insn 40 including the decrement of variable b: IF-CASE-2 found, start 6, else 4 deleting insn with uid = 40. deleting block 4 Conversion succeeded on pass 1. Thus variable b equals 0 in the end.
[Bug tree-optimization/103063] New: Wrong code while using -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103063 Bug ID: 103063 Summary: Wrong code while using -O3 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: ---

int a = 0;
unsigned char b = 0;

int main() {
  a - 6;
  for (; a >= -13; a = a - 8)
    while((unsigned char)(b-- * 6))
      ;
  if (b != 127)
    __builtin_abort();
  return 0;
}

The example works fine when compiled with -O{0,1,2}, whereas it fails with -O{3,fast} using gcc-12-4860-g73658e70d9e. I couldn't find a good commit so far. It fails on IBM Z as well as x64. I'm still not sure where it fails; ifcvt looks good to me.
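For readers wondering why the test expects 127, here is a small reference computation, written as a sketch rather than as part of the report. The function name `expected_final_b` is hypothetical. The inner condition is the old value of b times 6, truncated to 8 bits, and that is zero exactly when b is a multiple of 128. The first outer iteration therefore stops immediately at b = 0 (leaving b = 255 after the post-decrement), and the second one counts down from 255 until the old value is 128, leaving b = 127.

```c
#include <assert.h>

/* Replays the loop nest from the report with the post-decrement
   spelled out, so the value of b after each test is explicit.  */
static unsigned char expected_final_b(void)
{
    int a = 0;
    unsigned char b = 0;

    for (; a >= -13; a -= 8) {          /* runs for a = 0 and a = -8 */
        for (;;) {
            unsigned char old = b;
            b = (unsigned char)(b - 1); /* effect of b-- (wraps 0 -> 255) */
            if ((unsigned char)(old * 6) == 0)
                break;                  /* old is 0 or 128 */
        }
    }
    return b;
}
```

Since 128 * 6 = 768 = 3 * 256, the truncated product is zero at 128, which is the last value tested before the loop exits.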
[Bug tree-optimization/102752] [12 Regression] Recent change to ldist causing ICE on msp430-elf, rl78-elf, and xstormy16-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102752 --- Comment #6 from Stefan Schulze Frielinghaus --- Thanks for confirmation! Bootstrap and regtest are still running on x86 as well as IBM Z. I will commit the attached patch assuming successful runs.
[Bug tree-optimization/102752] [12 Regression] Recent change to ldist causing ICE on msp430-elf, rl78-elf, and xstormy16-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102752 --- Comment #5 from Stefan Schulze Frielinghaus --- Created attachment 51606 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51606&action=edit Fix determining precission of reduction_var
[Bug tree-optimization/102752] [12 Regression] Recent change to ldist causing ICE on msp430-elf, rl78-elf, and xstormy16-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102752 --- Comment #2 from Stefan Schulze Frielinghaus --- It looks like I forgot to take the TREE_TYPE of reduction_var. I just did a quick test with

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index fb9250031b5..0559b9c47d7 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -3430,7 +3430,7 @@ generate_strlen_builtin_using_rawmemchr (loop_p loop, tree reduction_var,
 static bool
 reduction_var_overflows_first (tree reduction_var, tree load_type)
 {
-  widest_int n2 = wi::lshift (1, TYPE_PRECISION (reduction_var));;
+  widest_int n2 = wi::lshift (1, TYPE_PRECISION (TREE_TYPE (reduction_var)));;
   widest_int m2 = wi::lshift (1, TYPE_PRECISION (ptrdiff_type_node) - 1);
   widest_int s = wi::to_widest (TYPE_SIZE_UNIT (load_type));
   return wi::ltu_p (n2, wi::udiv_trunc (m2, s));
@@ -3681,7 +3681,7 @@ loop_distribution::transform_reduction_loop (loop_p loop)
       && ((TYPE_PRECISION (sizetype) >= TYPE_PRECISION (ptr_type_node) - 1
            && TYPE_PRECISION (ptr_type_node) >= 32)
           || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (reduction_var))
-              && TYPE_PRECISION (reduction_var) <= TYPE_PRECISION (sizetype)))
+              && TYPE_PRECISION (TREE_TYPE (reduction_var)) <= TYPE_PRECISION (sizetype)))
       && builtin_decl_implicit (BUILT_IN_STRLEN))
     generate_strlen_builtin (loop, reduction_var, load_iv.base,
                              reduction_iv.base, loc);

successfully. It's getting late here. I will come back to this tomorrow morning. Sorry for the inconvenience.
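The comparison performed by reduction_var_overflows_first in the diff above can be replayed with plain integers. This is a sketch, not the GCC implementation: `toy_reduction_var_overflows_first` is a hypothetical name, the precisions are passed in directly instead of being read from tree types, and 64-bit arithmetic stands in for widest_int, so precisions above 64 are handled only by the early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* n2 = 2^reduction_prec, m2 = 2^(ptrdiff_prec - 1), s = load_size;
   the result is n2 < m2 / s (unsigned, truncating division), which
   mirrors the shape of the check in the diff.  */
static bool toy_reduction_var_overflows_first(unsigned reduction_prec,
                                              unsigned ptrdiff_prec,
                                              uint64_t load_size)
{
    assert(ptrdiff_prec >= 1 && ptrdiff_prec <= 64);
    assert(load_size >= 1);

    uint64_t m2 = UINT64_C(1) << (ptrdiff_prec - 1);

    /* 2^64 or more cannot be represented here, and it can never be
       smaller than m2 / s, which is at most 2^63.  */
    if (reduction_prec >= 64)
        return false;

    uint64_t n2 = UINT64_C(1) << reduction_prec;
    return n2 < m2 / load_size;
}
```

For instance, a 32-bit reduction variable with a 64-bit ptrdiff_t and one-byte loads gives 2^32 < 2^63, so the function returns true, whereas a 16-bit variable with a 16-bit ptrdiff_t gives 65536 < 32768, which is false. Feeding the wrong precision into such a check changes its outcome, which is why the precision has to come from the variable's type.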
[Bug tree-optimization/102720] [12 regression] gcc.dg/tree-ssa/ldist-strlen-1.c and ldist-strlen-2.c fail after r12-4324
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102720 Stefan Schulze Frielinghaus changed:

           What    |Removed |Added
             CC    |        |stefansf at linux dot ibm.com

--- Comment #4 from Stefan Schulze Frielinghaus ---

typedef __SIZE_TYPE__ size_t;
extern void* malloc (size_t);
extern void* memset (void*, int, size_t);

__attribute__((noinline)) int
test (char *s)
{
  int i;
  for (i=0; s[i]; ++i);
  return i;
}

int main (void)
{
  char *p = malloc (1024);
  memset (p, 0xf, 1024); // removed
  p[1] = 0;              // removed
  int i = test (p);
  if (i != 1)
    __builtin_abort ();
  return 0;
}

$ gcc -O2 -fdump-tree-dse2-details test.c

In dse2 we then have:

Deleted dead store: MEM[(char *)p_3 + 1B] = 0;
Deleted dead call: memset (p_3, 15, 1024);

Prior to g:008e7397dad971c03c08fc1b0a4a98fddccaaed8 the store and the call are not removed.
[Bug rtl-optimization/101260] [10/11/12 Regression] Backport 27381e78925 to GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 --- Comment #10 from Stefan Schulze Frielinghaus --- In regcprop we call find_oldest_value_reg which itself calls maybe_mode_change (TImode, TImode, DImode, 10, 18) where we have regno += subreg_regno_offset (regno, orig_mode, offset, new_mode); The call is made where offset equals 8 which is wrong since we are interested in the high part which is contained in r10 and not r11. The following patch fixes this: diff --git a/gcc/regcprop.c b/gcc/regcprop.c index d2a01130fe1..0e1ac12458a 100644 --- a/gcc/regcprop.c +++ b/gcc/regcprop.c @@ -414,9 +414,14 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, copy_nregs, &bytes_per_reg)) return NULL_RTX; poly_uint64 copy_offset = bytes_per_reg * (copy_nregs - use_nregs); - poly_uint64 offset - = subreg_size_lowpart_offset (GET_MODE_SIZE (new_mode) + copy_offset, - GET_MODE_SIZE (orig_mode)); + poly_uint64 offset = +#if WORDS_BIG_ENDIAN + subreg_size_highpart_offset +#else + subreg_size_lowpart_offset +#endif + (GET_MODE_SIZE (new_mode) + copy_offset, +GET_MODE_SIZE (orig_mode)); regno += subreg_regno_offset (regno, orig_mode, offset, new_mode); if (targetm.hard_regno_mode_ok (regno, new_mode)) return gen_raw_REG (new_mode, regno); With the patch (insn 234 222 235 14 (set (reg:DI 10 %r10 [ a ]) (reg:DI 18 %f4)) 1376 {*movdi_64} (nil)) is first modified into a noop (insn 234 222 235 14 (set (reg:DI 10 %r10 [ a ]) (reg:DI 10 %r10 [18])) 1376 {*movdi_64} (nil)) and then deleted within regcprop.
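The register arithmetic behind this can be sketched in a few lines. The following is a toy model under stated assumptions, not GCC's subreg machinery: `toy_part_offset` and `toy_subreg_regno` are hypothetical names, a 16-byte value is assumed to occupy two consecutive 8-byte registers, and word order is assumed to follow byte order.

```c
#include <assert.h>

/* Byte offset of the low or high OUTER bytes inside an INNER-byte
   value.  On big-endian the high part comes first (offset 0) and the
   low part last; on little-endian it is the other way round.  */
static unsigned toy_part_offset(unsigned outer, unsigned inner,
                                int big_endian, int want_highpart)
{
    assert(outer <= inner);
    int at_end = big_endian ? !want_highpart : want_highpart;
    return at_end ? inner - outer : 0;
}

/* Register that holds the part starting at BYTE_OFFSET when the value
   lives in consecutive registers beginning at FIRST_REGNO.  */
static unsigned toy_subreg_regno(unsigned first_regno, unsigned byte_offset,
                                 unsigned bytes_per_reg)
{
    assert(bytes_per_reg != 0);
    return first_regno + byte_offset / bytes_per_reg;
}
```

For a 16-byte value starting in register 10 on a big-endian target, the low-part offset is 8 and selects register 11, while the high-part offset is 0 and selects register 10, matching the description above that the high part is contained in r10 and not r11.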
[Bug middle-end/95681] False positive uninitialized variable usage in decNumberCompareTotalMag
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95681 --- Comment #4 from Stefan Schulze Frielinghaus --- Running today's mainline (d97d71a1989) with options -O3 -Wall against the reduced program on x86 as well as s390x results in

t.c: In function 'decNumberCompareTotalMag':
t.c:55:14: warning: '*allocbufa.bits' may be used uninitialized [-Wmaybe-uninitialized]
   55 | a->bits&=~0x80;
      | ^~
t.c:70:14: warning: '*allocbufb.bits' may be used uninitialized [-Wmaybe-uninitialized]
   70 | b->bits&=~0x80;
      | ^~
[Bug middle-end/95681] False positive uninitialized variable usage in decNumberCompareTotalMag
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95681 --- Comment #3 from Stefan Schulze Frielinghaus --- Created attachment 51160 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51160&action=edit Reduced program
[Bug tree-optimization/101260] [10/11 Regression] Backport 27381e78925 to GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 --- Comment #8 from Stefan Schulze Frielinghaus --- Pass split2 transforms

(insn 218 222 114 15 (set (reg/v:TI 10 %r10 [orig:87 a ] [87]) (reg/v:TI 18 %f4 [orig:87 a ] [87])) 1466 {movti} (nil))

into

(insn 234 222 235 15 (set (reg:DI 10 %r10 [ a ]) (reg:DI 18 %f4)) 1467 {*movdi_64} (nil))
(insn 235 234 114 15 (set (reg:DI 11 %r11 [orig:87 a+8 ] [87]) (unspec:DI [ (reg:V2DI 18 %f4) (const_int 1 [0x1]) ] UNSPEC_VEC_EXTRACT)) 495 {*vec_extractv2di} (nil))

which is then transformed by cprop_hardreg into

(insn 234 222 235 14 (set (reg:DI 10 %r10 [ a ]) (reg:DI 11 %r11 [18])) 1467 {*movdi_64} (expr_list:REG_DEAD (reg:DI 11 %r11 [18]) (nil)))
(insn 235 234 114 14 (set (reg:DI 11 %r11 [orig:87 a+8 ] [87]) (unspec:DI [ (reg:V2DI 18 %f4) (const_int 1 [0x1]) ] UNSPEC_VEC_EXTRACT)) 495 {*vec_extractv2di} (expr_list:REG_DEAD (reg:V2DI 18 %f4) (nil)))

where in insn 234 register f4 is substituted by r11 which is wrong. This can also be observed in the final assembler output:

vlvgp %v4,%r10,%r11
l %r2,12(%r1)
ahi %r2,-1
st %r2,12(%r1)
cijhe %r2,0,.L13
lgr %r10,%r11 // (*)
vlgvg %r11,%v4,1

Registers r10 and r11 are moved into v4. The inverse move from v4 into r10 and r11 is broken since cprop_hardreg wrongly substitutes f4 by r11. Thus the expected output for (*) is:

lgdr %r10,%f4
[Bug tree-optimization/101260] [10/11 Regression] Backport 27381e78925 to GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 --- Comment #7 from Stefan Schulze Frielinghaus --- I had a look at the optimized tree output which looks good to me. However, I see that split2 transforms (insn 218 222 114 15 (set (reg/v:TI 10 %r10 [orig:87 a ] [87]) (reg/v:TI 18 %f4 [orig:87 a ] [87])) 1466 {movti} (nil)) into (insn 234 222 235 15 (set (reg:DI 10 %r10 [ a ]) (reg:DI 18 %f4)) 1467 {*movdi_64} (nil)) (insn 235 234 114 15 (set (reg:DI 11 %r11 [orig:87 a+8 ] [87]) (unspec:DI [ (reg:V2DI 18 %f4) (const_int 1 [0x1]) ] UNSPEC_VEC_EXTRACT)) 495 {*vec_extractv2di} (nil)) which might be wrong. If I swap r10 by r11 via diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 7faf775fbf2..0319934062a 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -1747,8 +1747,8 @@ (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)] UNSPEC_VEC_EXTRACT))] { - operands[2] = operand_subword (operands[0], 0, 0, TImode); - operands[3] = operand_subword (operands[0], 1, 0, TImode); + operands[2] = operand_subword (operands[0], 1, 0, TImode); + operands[3] = operand_subword (operands[0], 0, 0, TImode); operands[4] = gen_rtx_REG (DImode, REGNO (operands[1])); operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1])); }) then the compiled program just runs fine. However, I'm not sure whether this fixes the problem or just the symptoms. I will come back to this tomorrow.
[Bug tree-optimization/101260] [10/11 Regression] Backport 27381e78925 to GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 --- Comment #5 from Stefan Schulze Frielinghaus --- Yes, I'm already looking into this.
[Bug tree-optimization/101260] Backport 27381e78925 to GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 --- Comment #3 from Stefan Schulze Frielinghaus --- The problem shows up for option -O1 (options -O{0,2,3} are fine) and GCC 10 and 11 (mainline and GCC 9 are fine).
[Bug tree-optimization/101260] New: Backport 27381e78925 to GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101260 Bug ID: 101260 Summary: Backport 27381e78925 to GCC 11 Product: gcc Version: 11.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Target: s390*-*-* struct a { unsigned b : 7; int c; int d; short e; } p, *q = &p; int f, g, h, i, r, s; static short j[8][1][6] = {}; char k[7]; short l, m; int *n; int **o = &n; void t() { for (; f;) ; } static struct a u(int x) { struct a a = {4, 8, 5, 4}; for (; i <= 6; i++) { struct a v = {}; for (; l; l++) h = 0; for (; h >= 0; h--) { j[i]; struct a *w = &p; s = 0; for (; s < 3; s++) { r ^= x; m = j[i][g][h] == (k[g] = g); *w = v; } r = 2; for (; r; r--) *o = &r; } } t(); return a; } int main() { *q = u(636); if (p.b != 4) __builtin_abort (); } The reduced example runs fine if compiled with mainline (currently 53fd7544aff) whereas it fails if compiled with GCC 11 (currently f6306457ee3). The example runs fine with GCC 11, too, if commit d1d01a66012a93cc8cb7dafbe1b5ec453ec96b59 is cherry picked. Can we backport this one?
[Bug ipa/101066] [10/11/12 Regression] Wrong code after fixup_cfg3 since r10-3311-gff6686d2e5f797d6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101066 --- Comment #4 from Stefan Schulze Frielinghaus --- (In reply to Martin Jambor from comment #3) > I have proposed a fix on the mailing list: > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573338.html I gave it a try on IBM Z where the testcase runs fine, now. Thanks!
[Bug analyzer/99212] [11 Regression] gcc.dg/analyzer/data-model-1.c line 971
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99212 --- Comment #20 from Stefan Schulze Frielinghaus --- The mentioned failing test cases are fixed on IBM Z, now. Thanks for your help!
[Bug analyzer/99212] [11 Regression] gcc.dg/analyzer/data-model-1.c line 971
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99212 Stefan Schulze Frielinghaus changed:

           What    |Removed |Added
             CC    |        |stefansf at linux dot ibm.com

--- Comment #17 from Stefan Schulze Frielinghaus --- The new testcases introduced by commit d3b1ef7a83c fail on IBM Z as well as some older data-model tests:

+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 113)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 115)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 117)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 119)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 121)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 123)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 125)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 127)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 24)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 26)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 29)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 31)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 36)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 41)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 81)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 83)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 85)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 87)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 92)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 94)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for warnings, line 96)
+FAIL: gcc.dg/analyzer/bitfields-1.c (test for excess errors)
+FAIL: gcc.dg/analyzer/data-model-1.c (test for warnings, line 947)
+FAIL: gcc.dg/analyzer/data-model-1.c (test for warnings, line 950)
+FAIL: gcc.dg/analyzer/data-model-1.c (test for warnings, line 965)
+FAIL: gcc.dg/analyzer/data-model-1.c (test for warnings, line 968)
+FAIL: gcc.dg/analyzer/data-model-1.c (test for excess errors)

The actual warning for those failing tests is "UNKNOWN".
[Bug c/101066] New: Wrong code after fixup_cfg3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101066 Bug ID: 101066 Summary: Wrong code after fixup_cfg3 Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Target: s390*-*-*, x86_64-*-*

#include <stdio.h>

int a = 1, c, d, e;
int *b = &a;

static int g(int *h) {
  c = *h;
  return d;
}

static void f(int *h) {
  e = *h;
  *b = 0;
  g(h);
}

int main() {
  f(b);
  printf("%d\n", c);
}

Running `gcc t.c -Os && ./a.out` prints 1 whereas 0 is expected. This does not happen for -O{0,1,2,3}, i.e., there 0 is printed. Prior to fixup_cfg3 the code looks good to me; afterwards the assignment to c uses a cached/initial value of variable a, which is wrong:

int main ()
{
  int * b.0_1;
  int c.1_2;
  int _6;
  int _7;
  int * b.2_8;
  int _10;
  int _11;

  [local count: 1073741824]:
  b.0_1 = b;
  _6 = *b.0_1;
  _7 = _6;
  e = _7;
  b.2_8 = b;
  *b.2_8 = 0;
  _10 = _6;
  c = _10;
  _11 = d;
  c.1_2 = c;
  printf ("%d\n", c.1_2);
  return 0;
}

Reproducible on IBM Z as well as x86_64 using commit 831589c227c.
[Bug debug/100960] New: var-tracking: parameter location in subregister not tracked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100960 Bug ID: 100960 Summary: var-tracking: parameter location in subregister not tracked Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: debug Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Target: s390x-*-* Created attachment 50960 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50960&action=edit var-tracking dump

On IBM Z we often have the case that debug information for a parameter points to the entry value only although the value is held in a register, too.

__attribute__((noinline, noclone)) void
f1 (int x)
{
  __asm volatile ("" : "+r" (x) : : "memory");
}

__attribute__((noinline, noclone)) int
f2 (int x)
{
  f1 (x);
  return x; // (*)
}

__attribute__((noinline, noclone)) int
f3 (int x)
{
  f2 (x);
  return 3;
}

int main ()
{
  f3 (42);
  return 0;
}

0x1000600 stmg %r12,%r15,96(%r15)
0x1000606 lay %r15,-160(%r15)
0x100060c lgr %r12,%r2
0x1000610 brasl %r14,0x10005f8
0x1000616 lgr %r2,%r12
0x100061a lmg %r12,%r15,256(%r15)
0x1000620 br %r14

At program point (*) debug information for parameter x points to the entry value only. Thus it is overlooked that the value was moved to call-saved register r12 prior to the call to f1. Having a look at var-tracking, this seems to boil down to the fact that register r2 is saved (lgr %r12,%r2) and restored (lgr %r2,%r12) in DI mode whereas parameter x has only SI mode, and the relation is not tracked. In other words, after the call to f1 var-tracking looks for an SI value for parameter x and doesn't find it, although it is a subvalue held in register r12. Is this a known deficiency or am I missing something?
[Bug middle-end/100562] New: ICE after commit a076632e274abe344ca7648b7c7f299273d4cbe0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100562 Bug ID: 100562 Summary: ICE after commit a076632e274abe344ca7648b7c7f299273d4cbe0 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Target: s390*-*-* Since commit g:a076632e274abe344ca7648b7c7f299273d4cbe0 building GCC with Go language enabled on IBM Z results in In function 'syscall.forkExec': go1: error: address taken, but ADDRESSABLE bit not set PHI argument &go..C479; for PHI node err$__object_78 = PHI during GIMPLE pass: fre go1: internal compiler error: verify_ssa failed 0x27e3349 verify_ssa(bool, bool) /home/stefansf/devel/gcc-3/src/gcc/tree-ssa.c:1214 0x21fee6f execute_function_todo /home/stefansf/devel/gcc-3/src/gcc/passes.c:2049 0x21fda15 do_per_function /home/stefansf/devel/gcc-3/src/gcc/passes.c:1687 0x21ff0a5 execute_todo /home/stefansf/devel/gcc-3/src/gcc/passes.c:2096 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions.
[Bug rtl-optimization/100263] [11/12 Regression] RTL optimizers miscompile loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263 Stefan Schulze Frielinghaus changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #17 from Stefan Schulze Frielinghaus --- Closing since fixed in releases/gcc-{8,9,10,11} and mainline.
[Bug rtl-optimization/100263] [11/12 Regression] RTL optimizers miscompile loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263 --- Comment #11 from Stefan Schulze Frielinghaus --- (In reply to Eric Botcazou from comment #10) > OK, then it's probably better to add it to: > > if (!is_a (reg_mode[regno], &old_mode) > || !MODES_OK_FOR_MOVE2ADD (mode, old_mode)) > return false; Ok, I will move the check up there. Currently running bootstrap+regtest on x86 as well as IBM Z.
[Bug rtl-optimization/100263] [11/12 Regression] RTL optimizers miscompile loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263 --- Comment #9 from Stefan Schulze Frielinghaus --- Shouldn't we rather check for REG_CAN_CHANGE_MODE_P? A check for TARGET_HARD_REGNO_MODE_OK for a FP register and QImode is successful. Using the following also fixes the test for me: diff --git a/gcc/postreload.c b/gcc/postreload.c index dc67643384d..3dccbe63cf4 100644 --- a/gcc/postreload.c +++ b/gcc/postreload.c @@ -1733,7 +1733,7 @@ move2add_valid_value_p (int regno, scalar_int_mode mode) regno of the lowpart might be different. */ poly_int64 s_off = subreg_lowpart_offset (mode, old_mode); s_off = subreg_regno_offset (regno, old_mode, s_off, mode); - if (maybe_ne (s_off, 0)) + if (maybe_ne (s_off, 0) || !REG_CAN_CHANGE_MODE_P (regno, old_mode, mode)) /* We could in principle adjust regno, check reg_mode[regno] to be BLKmode, and return s_off to the caller (vs. -1 for failure), but we currently have no callers that could make use of this
[Bug rtl-optimization/100263] [11/12 Regression] RTL optimizers miscompile loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263 --- Comment #6 from Stefan Schulze Frielinghaus --- Prior postreload we have (insn 12 379 332 3 (set (reg:QI 17 %f2 [orig:198 l_lsm_flag.27 ] [198]) (const_int 1 [0x1])) 1480 {*movqi} (expr_list:REG_EQUIV (const_int 1 [0x1]) (nil))) which gets substituted during postreload by (insn 12 379 332 3 (set (reg:QI 17 %f2 [orig:198 l_lsm_flag.27 ] [198]) (reg:QI 17 %f2 [orig:198 l_lsm_flag.27 ] [198])) 1480 {*movqi} (expr_list:REG_EQUIV (const_int 1 [0x1]) (nil))) which gets deleted during split2 where we have deleting insn with uid = 12. The culprit seems to be that postreload changes the RHS of the assignment to `reg:QI 17 %f2` which is wrong. Register %f2 holds indeed constant 1, however, in DImode which is not compatible to QImode on IBM Z. The decision takes place in function move2add_valid_value_p where the wrong offset is returned by call subreg_regno_offset (17, DImode, 7, QImode) I would have expected offset 7 but 0 is returned which renders the subsequent if-condition false. The 0 comes from function call subreg_get_info which returns in this case via /* Lowpart subregs are otherwise valid. */ if (!rknown && known_eq (offset, subreg_lowpart_offset (ymode, xmode))) { info->representable_p = true; rknown = true; if (known_eq (offset, 0U) || nregs_xmode == nregs_ymode) { info->offset = 0; info->nregs = nregs_ymode; return; } } The offset doesn't equal zero but the number of registers. Still the offset is set to zero. I did a quick test by using instead diff --git a/gcc/postreload.c b/gcc/postreload.c index dc67643384d..64297be2c45 100644 --- a/gcc/postreload.c +++ b/gcc/postreload.c @@ -1732,12 +1732,7 @@ move2add_valid_value_p (int regno, scalar_int_mode mode) (REG:reg_mode[regno] regno). Now, for big endian, the starting regno of the lowpart might be different. 
*/ poly_int64 s_off = subreg_lowpart_offset (mode, old_mode); - s_off = subreg_regno_offset (regno, old_mode, s_off, mode); - if (maybe_ne (s_off, 0)) - /* We could in principle adjust regno, check reg_mode[regno] to be - BLKmode, and return s_off to the caller (vs. -1 for failure), - but we currently have no callers that could make use of this - information. */ + if (simplify_subreg_regno (regno, old_mode, s_off, mode) < 0) return false; } which works at least for the example (haven't done a bootstrap nor regtest yet). However, I'm still wondering whether subreg_get_info is supposed to return with a zero offset in cases like this? Any thoughts?
[Bug rtl-optimization/100263] [11/12 Regression] RTL optimizers miscompile loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263 --- Comment #5 from Stefan Schulze Frielinghaus --- It looks like a mode mismatch: (insn 201 200 378 3 (set (reg:DI 17 %f2 [196]) (const_int 1 [0x1])) "t.c":23:36 1467 {*movdi_64} (expr_list:REG_EQUIV (const_int 1 [0x1]) (nil))) ... (insn 312 44 313 4 (set (reg:QI 5 %r5 [orig:74 c__lsm_flag.21 ] [74]) (reg:QI 17 %f2 [orig:198 l_lsm_flag.27 ] [198])) "t.c":13:14 1480 {*movqi} (nil)) ... (insn 245 244 246 41 (set (reg:SI 5 %r5 [orig:169 c__lsm_flag.21+-3 ] [169]) (zero_extend:SI (reg:QI 5 %r5 [orig:74 c__lsm_flag.21 ] [74]))) 1652 {*zero_extendqisi2_extimm} (nil)) (note 246 245 247 41 NOTE_INSN_DELETED) (jump_insn 247 246 248 41 (parallel [ (set (pc) (if_then_else (eq (reg:SI 5 %r5 [orig:169 c__lsm_flag.21+-3 ] [169]) (const_int 0 [0])) (label_ref 251) (pc))) (clobber (reg:CC 33 %cc)) ]) 1458 {*cmp_and_br_signed_si} (expr_list:REG_DEAD (reg:SI 5 %r5 [orig:169 c__lsm_flag.21+-3 ] [169]) (expr_list:REG_UNUSED (reg:CC 33 %cc) (int_list:REG_BR_PROB 357913950 (nil -> 251) (note 248 247 249 42 [bb 42] NOTE_INSN_BASIC_BLOCK) (insn 249 248 250 42 (set (reg/f:DI 1 %r1 [170]) (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])) 1467 {*movdi_64} (expr_list:REG_EQUIV (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) (nil))) (insn 250 249 251 42 (set (mem/c:QI (plus:DI (reg/f:DI 1 %r1 [170]) (const_int 135 [0x87])) [0 MEM[(char *)&c + 107B]+0 S1 A8]) (reg:QI 18 %f4 [orig:73 D.2339 ] [73])) 1480 {*movqi} (expr_list:REG_DEAD (reg:QI 18 %f4 [orig:73 D.2339 ] [73]) (expr_list:REG_DEAD (reg/f:DI 1 %r1 [170]) (nil Register f2 is written to in DI mode and read from in QI mode. The final assembler for the read is `vlgvb %r5,%v2,0`. Inspecting v2 via GDB we have: v2_int64 = {0x1, 0x0} which means r5 is zero afterwards and therefore the condition r5==0 is always true so the store from insn 250 never happens.
[Bug rtl-optimization/100263] [11/12 Regression] RTL optimizers miscompile loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263 --- Comment #4 from Stefan Schulze Frielinghaus --- You are right. I got lured by the fact that the assignments c__lsm.20_94 = 1; and c__lsm_flag.21_95 = 1; of bb5 are "moved" into the PHI as e.g. # c__lsm.20_51 = PHI # c__lsm_flag.21_53 = PHI I will have a look at the RTL output then.
[Bug tree-optimization/100263] New: Wrong removal of statement in copyprop3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100263

Bug ID: 100263
Summary: Wrong removal of statement in copyprop3
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: stefansf at linux dot ibm.com
Target Milestone: ---
Target: s390*-*-*

Created attachment 50676
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50676&action=edit
copyprop3

int a = 3, d, l, o, p, q;
unsigned char b, f, g;
unsigned char c[9][9][3];
long e;
unsigned h;
short j, k;
static unsigned char r(int s) {
  for (; b-2; b=b-2) {
    int *m = &d;
    if (a) {
      char *n = &c[3][8][2];
      *n = 1;
      for (; h;)
        for (; g;)
          for (; e;)
            ;
    } else {
      int i = 0;
      for (; i < 2; i++)
        if (*m)
          return 0;
      if (s)
        l = k |= ((j-- != b) <= s) - (long)s;
      else
        return f;
    }
  }
  return 0;
}
int main() {
  r(b);
  if (c[3][8][2] != 1)
    __builtin_abort ();
}

The outermost loop is executed 127 times. Since variable `a` does not change from its initial value 3, the store to `c[3][8][2]` must materialize, since the infinite loops over variables h, g, e are never executed. However, running `gcc -march=z13 t.c -O1 && ./a.out` results in an abort. Runs are successful when using optimization levels other than 1.

Prior to copyprop3 we have

c__lsm.20_74 = MEM[(char *)&c + 107B];
if (a.1_7 != 0)
  goto ; [50.00%]
else
  goto ; [50.00%]

[local count: 5160584476]:
c__lsm.20_94 = 1;
c__lsm_flag.21_95 = 1;

whereas afterwards the store to `c__lsm.20_94` is removed. The dump file contains:

Removing dead stmt c__lsm.20_94 = 1;

In total, the only store to `c[3][8][2]` now happens in the inner loop over variable `h`, which is never executed. There we have `MEM[(char *)&c + 107B] = 1;`.

Tested against g:3971aee9dd8d6323c377d1b241173f7d2b51a835 on IBM Z where it fails. Runs successfully on x86 (by chance?). Bisection stops at g:f9e1ea10e657af9fb02fafecf1a600740fd34409 which might not be the root cause.
[Bug sanitizer/99814] regexec fails with -fsanitize=address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99814 --- Comment #4 from Stefan Schulze Frielinghaus --- Thanks for the pointers! I reported it upstream in issue [1390](https://github.com/google/sanitizers/issues/1390)
[Bug sanitizer/99814] regexec fails with -fsanitize=address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99814

--- Comment #2 from Stefan Schulze Frielinghaus ---
Breakpoint 4, __interception::InterceptFunction (name=0x3fffd61e8f2 "regexec", ver=0x3fffd61eb7e "GLIBC_2.3.4", ptr_to_real=0x3fffd677d08 <__interception::real_regexec>, func=16779728, wrapper=4398001883504)
    at /devel/gcc-4/src/libsanitizer/interception/interception_linux.cpp:74
74        void *addr = GetFuncAddr(name, ver);

At the end of InterceptFunction we have:

(gdb) print addr
$1 = (void *) 0x3fffd2e9110 <__GI___regexec>

The address itself also LGTM, i.e., `readelf -s /lib64/libc.so.6 | grep regexec` results in:

  279: 000e9110 344 FUNC GLOBAL DEFAULT 13 regexec@@GLIBC_2.3.4
...
25156: 000e9110 344 FUNC LOCAL DEFAULT 13 __GI___regexec

However, variables func and wrapper differ

(gdb) print func
$2 = 16779728
(gdb) print wrapper
$3 = 4398001883504

so we return false.
[Bug sanitizer/99814] New: regexec fails with -fsanitize=address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99814 Bug ID: 99814 Summary: regexec fails with -fsanitize=address Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone: --- Target: s390x Testing against today's commit https://gcc.gnu.org/g:d579e2e76f9469e1b386d693af57c5c4f0ede410 on s390x we have: $ gcc pr98920.c -fsanitize=address && ./a.out failed to match The testcase succeeds without `-fsanitize=address`. In GDB I see that the address loaded from _ZN14__interception12real_regexecE equals the address of regexec@GLIBC_2.2 which explains why the testcase fails. Without `-fsanitize=address` function regexec@@GLIBC_2.3.4 is executed.
[Bug preprocessor/99313] New: ICE while changing global target options via pragma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99313 Bug ID: 99313 Summary: ICE while changing global target options via pragma Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: preprocessor Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- #pragma GCC push_options #pragma GCC target ("arch=z13") #pragma GCC pop_options $ gcc t.c -c -march=z900 test.c:3:9: internal compiler error: 'global_options' are modified in local context 3 | #pragma GCC pop_options | ^~~ 0x20035cf cl_optimization_compare(gcc_options*, gcc_options*) /devel/build/gcc/options-save.c:12836 0x1720b4b handle_pragma_pop_options /devel/src/gcc/c-family/c-pragma.c:1092 0x17218a5 c_invoke_pragma_handler(unsigned int) /devel/src/gcc/c-family/c-pragma.c:1515 0x1636ed7 c_parser_pragma /devel/src/gcc/c/c-parser.c:12519 0x1617165 c_parser_external_declaration /devel/src/gcc/c/c-parser.c:1758 0x1616c51 c_parser_translation_unit /devel/src/gcc/c/c-parser.c:1650 0x1660487 c_parse_file() /devel/src/gcc/c/c-parser.c:21984 0x1718aab c_common_parse_file() /devel/src/gcc/c-family/c-opts.c:1218 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. Started with dc6d15eaa23cbae1468a6ef92371b1c856c14819
[Bug tree-optimization/99253] [10 Regression] tree-vect-loop wrong code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99253 Stefan Schulze Frielinghaus changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Stefan Schulze Frielinghaus --- Can confirm, fixed on IBM Z. Thanks!
[Bug tree-optimization/99253] [10/11 Regression] tree-vect-loop wrong code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99253 --- Comment #2 from Stefan Schulze Frielinghaus --- Still aborts with -fno-vect-cost-model on IBM Z.
[Bug tree-optimization/99253] New: tree-vect-loop wrong code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99253 Bug ID: 99253 Summary: tree-vect-loop wrong code Product: gcc Version: 11.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Target: s390x, x86_64-*-* int a = 0; static int b = 0; long c = 0; int main() { for (int d = 0; d < 8; d++) { a ^= c; b = a; a ^= 1; } if (b != 1) __builtin_abort(); return 0; } Aborts when built with: gcc -O3 t.c Bisection stops at commit 04bff1bbfc11a974342c0eb0c0d65d902e36e82e on IBM Z and at commit b7ff7cef5005721e78d6936bed3ae1c059b4e8d2 on x86 ¯\_(ツ)_/¯
[Bug rtl-optimization/99221] copyprop_hardreg_forward_1 deletes insn by mistake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99221 --- Comment #1 from Stefan Schulze Frielinghaus --- Created attachment 50243 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50243&action=edit a-foo.i.307r.cprop_hardreg
[Bug rtl-optimization/99221] New: copyprop_hardreg_forward_1 deletes insn by mistake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99221 Bug ID: 99221 Summary: copyprop_hardreg_forward_1 deletes insn by mistake Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Created attachment 50242 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50242&action=edit a-foo.i.301r.jump2 Consider the following reduced example: int g = 9, h, i, b; char j, k; long m; char *a = &j; char n() { return 0; } void o(short o) {} short p(short o, short q) { return q; } long r(long s) { return s; } long t(long s) { return 0; } long u(long s) { return 0; } short v() { return 0; } void x(int **s, unsigned q) { short c = 0, d = 0; int e = 4; if (g) { int f = 4; q = 1; for (; q <= 4; q++) { m = q; d = i = u(d); if (0 == r(1)) { if (u(0)) { h = t(q) != **s; for (; e; e = f &= 0 <= 0) ; } } else { b = c; c = v(); o(q); k = n(); j = p(k || q, q); } } } } int main() { x(0, 0); printf("%d\n", j); return 0; } Command line: gcc -march=arch11 -w -Og foo.i && ./a.out Expected output: 4 Actual output: 0 On IBM Z we have prior pass cprop_hardreg the following insns of interest: (insn 24 23 25 4 (set (reg:DI 24 %f8 [orig:61 _2 ] [61]) (reg/v:DI 12 %r12 [orig:85 q+-4 ] [85])) "./foo.i":19:10 1462 {*movdi_64} (nil)) ... (insn 80 79 81 10 (set (reg:HI 24 %f8 [orig:74 _15 ] [74]) (reg:HI 12 %r12 [orig:85 q+2 ] [85])) "./foo.i":30:10 1474 {*movhi} (nil)) ... (insn 155 96 97 15 (set (reg:HI 1 %r1 [orig:74 _15 ] [74]) (reg:HI 24 %f8 [orig:74 _15 ] [74])) "./foo.i":32:14 1474 {*movhi} (nil)) Register f8 is set to a 64-bit value in insn 24 and to a 16-bit value in insn 80, respectively, while using the same source register r12. During copyprop_hardreg_forward_1 it is then wrongly detected that insn 80 is a noop set and is subsequently removed. 
Due to different alignments of different modes in FPRs we have that in insn 155 the wrong part of register f8 is then accessed which results in constant value zero. In order to decide whether an insn is a noop set or not I gave it a try by additionally testing whether the prior set and the current are done in compatible modes by asking the backend: diff --git a/gcc/regcprop.c b/gcc/regcprop.c index e1342f56bd1..02753a12510 100644 --- a/gcc/regcprop.c +++ b/gcc/regcprop.c @@ -474,7 +474,8 @@ find_oldest_value_reg (enum reg_class cl, rtx reg, struct value_data *vd) (set (...) (reg:DI r9)) Replacing r9 with r11 is invalid. */ if (mode != vd->e[regno].mode - && REG_NREGS (reg) > hard_regno_nregs (regno, vd->e[regno].mode)) + && (REG_NREGS (reg) > hard_regno_nregs (regno, vd->e[regno].mode) + || !REG_CAN_CHANGE_MODE_P (regno, mode, vd->e[regno].mode))) return NULL_RTX; for (i = vd->e[regno].oldest_regno; i != regno; i = vd->e[i].next_regno) Any thoughts about this fix?
[Bug tree-optimization/98094] ICE in decompose, at wide-int.h:984
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98094 Stefan Schulze Frielinghaus changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #6 from Stefan Schulze Frielinghaus --- Ah yes commit c961e94901eb793b1a18d431a1acf7f682eaf04f seems to have fixed this. Closing since fixed. Thanks for your help!
[Bug tree-optimization/98094] ICE in decompose, at wide-int.h:984
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98094 --- Comment #3 from Stefan Schulze Frielinghaus --- I still run into the same error with e4c02ce4ab6fce1148f4025360096f18764deadf
[Bug tree-optimization/98094] ICE in decompose, at wide-int.h:984
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98094 --- Comment #1 from Stefan Schulze Frielinghaus --- Reduced program: struct { unsigned a : 10 } b; c; d() { c = b.a; if (c == 8 || c == 0) ; else if (c > 8 * 8) ; else if (c < 8 * 8) e(); }
[Bug tree-optimization/98094] New: ICE in decompose, at wide-int.h:984
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98094 Bug ID: 98094 Summary: ICE in decompose, at wide-int.h:984 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefansf at linux dot ibm.com Target Milestone: --- Compiling SPEC benchmark 502.gcc_r on S/390 results in the following ICE: $ /devel/gcc-2/dst/bin/gcc -c -o tree.o -DSPEC -DNDEBUG -I. -I./include -I./spec_qsort -DSPEC_502 -DSPEC_AUTO_SUPPRESS_OPENMP -DIN_GCC -DHAVE_CONFIG_H -march=arch13 -O3 -std=gnu89 -DSPEC_LP64 tree.c during GIMPLE pass: iftoswitch tree.c: In function 'tree_floor_log2': tree.c:10732: internal compiler error: in decompose, at wide-int.h:984 0x119ce51 wi::int_traits > >::decompose(long*, unsigned int, generic_wide_int > const&) /devel/gcc-2/src/gcc/wide-int.h:984 0x1a44837 wi::int_traits > >::decompose(long*, unsigned int, generic_wide_int > const&) /devel/gcc-2/src/gcc/tree.h:3445 0x1a44837 wide_int_ref_storage::wide_int_ref_storage > >(generic_wide_int > const&, unsig ned int) /devel/gcc-2/src/gcc/wide-int.h:1034 0x1a44837 generic_wide_int >::generic_wide_int > >(generic_wide_int > const&, unsigned int) /devel/gcc-2/src/gcc/wide-int.h:790 0x1a44837 wi::binary_traits >, generic_wide_int >, wi::int_traits > >::precision_type, wi::int_traits > >::precision_type>::result_type wi::sub >, generic_wide_int > >(generic_wide_int > const&, generic_wide_int > const&) /devel/gcc-2/src/gcc/wide-int.h:2513 0x1a44837 wi::binary_traits >, generic_wide_int >, wi::int_traits > >::precision_type, wi::int_traits > >::precision_type>::operator_result operator- >, generic_wide_int > >(generic_wide_int > const&, generic_wide_int > co nst&) /devel/gcc-2/src/gcc/wide-int.h:3297 0x1a44837 tree_switch_conversion::cluster::get_range(tree_node*, tree_node*) /devel/gcc-2/src/gcc/tree-switch-conversion.h:87 0x1a3771d tree_switch_conversion::jump_table_cluster::can_be_handled(vec const&, unsigned int, unsigned 
int) /devel/gcc-2/src/gcc/tree-switch-conversion.c:1265 0x1a3d8d5 tree_switch_conversion::jump_table_cluster::can_be_handled(vec const&, unsigned int, unsigned int) /devel/gcc-2/src/gcc/tree-switch-conversion.c:1258 0x1a3d8d5 tree_switch_conversion::jump_table_cluster::find_jump_tables(vec&) /devel/gcc-2/src/gcc/tree-switch-conversion.c:1201 0x1a3d8d5 tree_switch_conversion::jump_table_cluster::find_jump_tables(vec&) /devel/gcc-2/src/gcc/tree-switch-conversion.c:1175 0x22876d9 if_chain::is_beneficial() /devel/gcc-2/src/gcc/gimple-if-to-switch.cc:244 0x2289237 execute /devel/gcc-2/src/gcc/gimple-if-to-switch.cc:530 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. Bisect stops at 03eb09292ef228d1d12b5168cdd748583b1f992a
[Bug tree-optimization/97545] New: ICE since commit 90e88fd376b and using selective-scheduling2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97545

Bug ID: 97545
Summary: ICE since commit 90e88fd376b and using selective-scheduling2
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: stefansf at linux dot ibm.com
Target Milestone: ---

Created attachment 49433
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49433&action=edit
reduced failing example

Since commit 90e88fd376b compiling the attached program on S/390 results in:

$ gcc -O3 -fselective-scheduling2 t.i
during RTL pass: sched2
: In function 'main':
:67:1: internal compiler error: Segmentation fault
0x21e3323 crash_signal
        /home/stefansf/devel/gcc-2/src/gcc/toplev.c:330
0x171da48 NEXT_INSN(rtx_insn const*)
        /home/stefansf/devel/gcc-2/src/gcc/rtl.h:1469
0x28be551 s390_sched_init
        /home/stefansf/devel/gcc-2/src/gcc/config/s390/s390.c:15129
0x213c1f7 sel_region_init
        /home/stefansf/devel/gcc-2/src/gcc/sel-sched.c:6929
0x213e5d7 sel_sched_region(int)
        /home/stefansf/devel/gcc-2/src/gcc/sel-sched.c:7624
0x213e853 run_selective_scheduling()
        /home/stefansf/devel/gcc-2/src/gcc/sel-sched.c:7720
0x2104d47 rest_of_handle_sched2
        /home/stefansf/devel/gcc-2/src/gcc/sched-rgn.c:3738
0x21050b1 execute
        /home/stefansf/devel/gcc-2/src/gcc/sched-rgn.c:3882
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

whereas

gcc -O3 -fselective-scheduling2 t.i -fevrp-mode=legacy

works fine. It looks as if current_sched_info->prev_head gets corrupted at some point. Adding a breakpoint prior to the ICE and then trying to debug print results in:

(gdb) call debug (current_sched_info->prev_head)
(??? bad code 42405 )
[Bug ada/97504] [11 Regression] Ada bootstrap error after r11-4029
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97504 --- Comment #8 from Stefan Schulze Frielinghaus --- (In reply to Alexandre Oliva from comment #5) > Created attachment 49427 [details] > patch that should fix the remaining s390 problem > > So, the issue is already fixed on aarch64-*, powerpc*-*, and > sparc*-sun-solaris*. > > Stefan, could you possibly confirm that this patch fixes it on s390? Build is successful on s390. Testsuite still running.
[Bug ada/97504] [11 Regression] Ada bootstrap error after r11-4029
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97504 Stefan Schulze Frielinghaus changed: What|Removed |Added CC||stefansf at linux dot ibm.com --- Comment #2 from Stefan Schulze Frielinghaus --- For the sake of completeness, this also fails on S/390.
[Bug tree-optimization/97152] New: Wrong code generation since commit b6ff3ddecfa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97152

Bug ID: 97152
Summary: Wrong code generation since commit b6ff3ddecfa
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: stefansf at linux dot ibm.com
Target Milestone: ---

Since commit b6ff3ddecfa the following program prints the wrong number to stdout:

int a = 0, j = 0;
int *b = 0;
int **c = &b, **i = 0;
unsigned d = 0;
unsigned short e = 4;
long f = 0;
char g = 0, k = 99;
static unsigned *dPTR = &d;
void l() {
  for (;;) {
    int m = 0;
    for (g = 0; g == 0; g++) {
      unsigned short *n = &e;
      *c = &j;
      if ((*n)++ == 0)
        return;
      for (d = 3; d <= 8; d++)
        ;
      *dPTR = 0;
    }
    for (; a != 0;)
      for (;;)
        for (; f != 0;)
          *i = &m;
  }
}
int main() {
  l();
  printf("%d\n", d);
  return d;
}

The inner loop over g always executes exactly once. Starting from e = 4, the expression (*n)++ yields a non-zero value USHRT_MAX-3 times and then, once e has wrapped around, evaluates to zero, which renders the if-condition true so that the call to l returns. Prior to that, variable d gets set to zero via *dPTR = 0. Thus 0 is expected but 9 is printed. Any idea what went wrong?