[Bug libstdc++/88947] regex_match doesn't fail early when given a non-matching pattern with a start-of-input anchor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88947

--- Comment #6 from Tim Shen ---
(In reply to Tomalak Geret'kal from comment #4)
> To be honest I'd expect this in less trivial circumstances too. If, at a
> given stage of processing, the only possible paths towards a match all
> require a prefix that's already been ruled out, that should be an immediate
> return false.

Thinking about this more, I think it's also easy to support the following case:

  regex_search(..., regex("{arbitrary_literal_string}..."))

where {arbitrary_literal_string} is a literal string, without other regex magic like "|" or "*". The literal prefix should be passed into a substring search.

For implementation, the new algorithm for regex_search() would be:
(1) prefix = find the longest deterministic prefix of the regex.
(2) pos = find the first occurrence of the prefix in the target. If it fails, return false.
(3) target = target[pos+prefix.size():]
(4) try to match target from the start (as it's currently done).
(5) if (4) fails, go to (2).

(1) can be done at regex-compiling time. It'd require a new data member and thus would not be ABI compatible.
(1) can also be done at matching time. Either way, we are looking for the longest sequence of literals from the start. It stops at any of "*", "?", "(" or any other magic.
(2) can be as plain as a substring search, but one might prefer the O(n) KMP algorithm if we happen to have one.
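The steps above can be sketched in user code, outside the library. This is only an illustration under stated assumptions, not the libstdc++ implementation: literal_prefix() and prefix_accelerated_search() are hypothetical names, the pattern is assumed to use the default ECMAScript grammar without icase, and the prefix extraction is deliberately conservative (it gives up on any "|"). Unlike step (3), the sketch does not strip the prefix; it re-runs the whole regex anchored at each candidate position using std::regex_constants::match_continuous.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: the longest run of plain literal characters at the
// start of an ECMAScript pattern that every match must begin with.
std::string literal_prefix(const std::string& pattern) {
    // With alternation no prefix is guaranteed; to keep the sketch simple,
    // give up whenever '|' appears anywhere in the pattern.
    if (pattern.find('|') != std::string::npos) return {};

    const std::string magic = "\\^$.*+?()[]{}|";
    const std::string quantifiers = "*+?{";
    std::string prefix;
    for (char c : pattern) {
        if (magic.find(c) != std::string::npos) {
            // A quantifier applies to the previous literal, so that literal
            // is dropped (conservatively, even for '+').
            if (quantifiers.find(c) != std::string::npos && !prefix.empty())
                prefix.pop_back();
            break;
        }
        prefix.push_back(c);
    }
    return prefix;
}

// Hypothetical wrapper: same yes/no answer as std::regex_search, but the
// regex engine only runs at positions where the literal prefix occurs.
bool prefix_accelerated_search(const std::string& target,
                               const std::string& pattern) {
    const std::regex re(pattern);
    const std::string prefix = literal_prefix(pattern);
    if (prefix.empty())
        return std::regex_search(target, re);  // nothing to accelerate

    std::size_t pos = 0;
    while ((pos = target.find(prefix, pos)) != std::string::npos) {
        const auto first = target.cbegin() + static_cast<std::ptrdiff_t>(pos);
        // match_continuous: the match must start exactly at `first`.
        if (std::regex_search(first, target.cend(), re,
                              std::regex_constants::match_continuous))
            return true;
        ++pos;  // step (5): look for the next occurrence of the prefix
    }
    return false;
}
```

For example, prefix_accelerated_search("say what now", "what.*") only invokes the regex engine at offset 4, and a target that does not contain "what" at all is rejected by the substring search alone.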
[Bug target/88938] ICE in extract_insn, at recog.c:2304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88938

--- Comment #5 from uros at gcc dot gnu.org ---
Author: uros
Date: Tue Jan 22 16:35:53 2019
New Revision: 268157

URL: https://gcc.gnu.org/viewcvs?rev=268157&root=gcc&view=rev
Log:
    PR target/88938
    * config/i386/i386.c (ix86_expand_builtin) [case IX86_BUILTIN_BEXTRI32,
    case IX86_BUILTIN_BEXTRI64]: Sanitize operands.

testsuite/ChangeLog:

    PR target/88938
    * gcc.target/i386/pr88938.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr88938.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/i386.c
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
[Bug target/88965] powerpc64le vector builtin hits ICE in verify_gimple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88965 --- Comment #6 from Segher Boessenkool --- That patch looks good, and is pre-approved. Thanks!
[Bug target/88938] ICE in extract_insn, at recog.c:2304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88938

--- Comment #4 from uros at gcc dot gnu.org ---
Author: uros
Date: Tue Jan 22 16:32:47 2019
New Revision: 268156

URL: https://gcc.gnu.org/viewcvs?rev=268156&root=gcc&view=rev
Log:
    PR target/88938
    * config/i386/i386.c (ix86_expand_builtin) [case IX86_BUILTIN_BEXTRI32,
    case IX86_BUILTIN_BEXTRI64]: Sanitize operands.

testsuite/ChangeLog:

    PR target/88938
    * gcc.target/i386/pr88938.c: New test.

Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr88938.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/i386/i386.c
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064

--- Comment #23 from Jakub Jelinek ---
Created attachment 45496
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45496&action=edit
gcc9-pr87064.patch

Patch I've so far tested on powerpc64le-linux only, where it fixed
FAIL: libgomp.oacc-fortran/reduction-3.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -O1 execution test
and didn't regress anything else.

I can bootstrap/regtest even on powerpc64-linux (though I believe it is pointless, given that I know from the earlier statistics gathering that the pattern is never used on powerpc64-linux during bootstrap nor -m32/-m64 regtest).

So, I'll post to gcc-patches. The v4sf_scalar I'll leave to you, ok?
[Bug c++/88983] New: ICE in label_matches, at cp/constexpr.c:4035
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88983

            Bug ID: 88983
           Summary: ICE in label_matches, at cp/constexpr.c:4035
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-9.0.0-alpha20190120 snapshot (r268107), 8.2, 7.4, 6.3 ICE when compiling the following testcase reduced from test/SemaCXX/constant-expression-cxx1y.cpp from the clang 7.0.1 testsuite:

constexpr int
ni (int ay)
{
  switch (ay)
    {
      if (1)
        {
        case 1:
          return 1;
        }
      else
        {
        default:
          ;
        }
    }

  return 0;
}

static_assert (ni (1), "");

% g++-9.0.0-alpha20190120 -c xd96vus4.cpp
xd96vus4.cpp:21:19:   in 'constexpr' expansion of 'ni(1)'
xd96vus4.cpp:21:27: internal compiler error: in label_matches, at cp/constexpr.c:4035
   21 | static_assert (ni (1), "");
      |                           ^
0x5cde7e label_matches
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4035
0x8cc336 cxx_eval_constant_expression
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4223
0x8cc53c cxx_eval_constant_expression
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4484
0x8ccd7f cxx_eval_switch_expr
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4148
0x8ccd7f cxx_eval_constant_expression
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4938
0x8cce62 cxx_eval_statement_list
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4073
0x8cce62 cxx_eval_constant_expression
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4836
0x8cba23 cxx_eval_call_expression
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:1799
0x8cc800 cxx_eval_constant_expression
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:4357
0x8d2672 cxx_eval_outermost_constant_expr
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:5095
0x8d2dc8 maybe_constant_value(tree_node*, tree_node*, bool)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/constexpr.c:5327
0x8e5265 cp_fully_fold(tree_node*)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:2163
0xa63fa4 cp_build_binary_op(op_location_t const&, tree_code, tree_node*, tree_node*, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/typeck.c:5538
0xa66aac build_binary_op(unsigned int, tree_code, tree_node*, tree_node*, bool)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/typeck.c:4246
0xa66b2b cp_truthvalue_conversion(tree_node*)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/typeck.c:5864
0x8e9d8d ocp_convert(tree_node*, tree_node*, int, int, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cvt.c:844
0x8eb37d cp_convert(tree_node*, tree_node*, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cvt.c:637
0x8eb37d cp_convert_and_check(tree_node*, tree_node*, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cvt.c:656
0x896934 convert_like_real
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/call.c:7327
0x897af0 perform_implicit_conversion_flags(tree_node*, tree_node*, int, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/call.c:11043
[Bug c++/88984] New: [9 Regression] ICE in genericize_switch_stmt, at cp/cp-gimplify.c:377
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88984

            Bug ID: 88984
           Summary: [9 Regression] ICE in genericize_switch_stmt, at cp/cp-gimplify.c:377
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-9.0.0-alpha20190120 snapshot (r268107) ICEs when compiling the following testcase extracted from test/Sema/loop-control.c from the clang 7.0.1 testsuite:

void pr8880_18(int x, int y) {
  while(x > 0)
    switch(({if(y) break; y;})) {
    case 2: x = 0;
    }
}

% g++-9.0.0-alpha20190120 -c rqxhnkhh.c
rqxhnkhh.c: In function 'void pr8880_18(int, int)':
rqxhnkhh.c:6:1: internal compiler error: in genericize_switch_stmt, at cp/cp-gimplify.c:377
    6 | }
      | ^
0x5d9807 genericize_switch_stmt
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:377
0x5d9807 cp_genericize_r
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:1505
0x1273013 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*), void*, hash_set >*, tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*, hash_set >*))
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree.c:12064
0x8dc138 genericize_cp_loop
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:251
0x8df2ca genericize_do_stmt
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:346
0x8df2ca cp_genericize_r
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:1501
0x1273013 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*), void*, hash_set >*, tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*, hash_set >*))
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree.c:12064
0x8e0bd2 cp_genericize_tree
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:1629
0x8e0f94 cp_genericize(tree_node*)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/cp-gimplify.c:1778
0x91a65d finish_function(bool)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/decl.c:16183
0x9bb2f8 cp_parser_function_definition_after_declarator
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27633
0x9bc0dc cp_parser_function_definition_from_specifiers_and_declarator
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27546
0x9bc0dc cp_parser_init_declarator
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:20205
0x99d398 cp_parser_simple_declaration
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:13476
0x9c2c10 cp_parser_declaration
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:13173
0x9c33a0 cp_parser_translation_unit
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:4698
0x9c33a0 c_parse_file()
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:41003
0xaccdab c_common_parse_file()
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c-family/c-opts.c:1155
[Bug c++/88982] New: ICE in tsubst_pack_expansion, at cp/pt.c:12221
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88982

            Bug ID: 88982
           Summary: ICE in tsubst_pack_expansion, at cp/pt.c:12221
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-9.0.0-alpha20190120 snapshot (r268107) ICE when compiling the following testcase reduced from test/CXX/temp/temp.param/p15-cxx0x.cpp from the clang 7.0.1 testsuite:

template struct A { template class ...Cs, Cs ...Vs> struct B { B() { } }; };
template using Int = int;
template using Char = char;
A::B b;

% g++-9.0.0-alpha20190120 -c rtzrooh4.cpp
rtzrooh4.cpp:10:27: internal compiler error: in tsubst_pack_expansion, at cp/pt.c:12221
   10 | A::B b;
      | ^
0x63d616 tsubst_pack_expansion(tree_node*, tree_node*, int, tree_node*)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/pt.c:12221
0x9ef5bb coerce_template_parameter_pack
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/pt.c:8121
0x9ef5bb coerce_template_parms
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/pt.c:8411
0x9f08a1 coerce_innermost_template_parms
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/pt.c:8619
0x9fcaaa lookup_template_class_1
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/pt.c:9324
0x9fcaaa lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*, int, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/pt.c:9683
0xa29c8b finish_template_type(tree_node*, tree_node*, int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/semantics.c:3255
0x9a4aad cp_parser_template_id
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:16406
0x9a4c66 cp_parser_class_name
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:23127
0x9a8b42 cp_parser_qualifying_entity
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:6683
0x9a8b42 cp_parser_nested_name_specifier_opt
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:6369
0x99c9ab cp_parser_constructor_declarator_p
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27312
0x99c9ab cp_parser_decl_specifier_seq
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:14038
0x99d2a4 cp_parser_simple_declaration
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:13354
0x9c2c10 cp_parser_declaration
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:13173
0x9c33a0 cp_parser_translation_unit
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:4698
0x9c33a0 c_parse_file()
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:41003
0xaccdab c_common_parse_file()
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c-family/c-opts.c:1155
[Bug libstdc++/88947] regex_match doesn't fail early when given a non-matching pattern with a start-of-input anchor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88947

--- Comment #5 from Tim Shen ---
(In reply to Tomalak Geret'kal from comment #4)
> To be honest I'd expect this in less trivial circumstances too. If, at a
> given stage of processing, the only possible paths towards a match all
> require a prefix that's already been ruled out, that should be an immediate
> return false. To the best of my knowledge this is commonly what happens in
> regex engines (though again libstdc++ is far from alone in the C++ world in
> not doing so!)

For the original test case, have you tried regex_match() with "what.*"?

Do you have any non-trivial testcase in mind that is still unexpectedly slow with regex_match()?
[Bug target/88909] struct builtin_description doesn't support ix86_isa_flags2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88909

--- Comment #1 from hjl at gcc dot gnu.org ---
Author: hjl
Date: Tue Jan 22 16:20:25 2019
New Revision: 268155

URL: https://gcc.gnu.org/viewcvs?rev=268155&root=gcc&view=rev
Log:
i386: Add mask2 to builtin_description

There are

struct builtin_description
{
  const HOST_WIDE_INT mask;
  const enum insn_code icode;
  const char *const name;
  const enum ix86_builtins code;
  const enum rtx_code comparison;
  const int flag;
};

Since "mask" is used for both ix86_isa_flags and ix86_isa_flags2, builtins with both flags can't be handled easily. This patch adds mask2 to builtin_description to handle it properly.

2019-01-22  Hongtao Liu
            H.J. Lu

    PR target/88909
    * config/i386/i386-builtin.def: Add mask2 to all builtin
    initializations. Merge ARGS2 and SPECIAL_ARGS2 into ARGS and
    SPECIAL_ARGS.
    * config/i386/i386.c (BDESC): Add mask2 to the definition.
    (BDESC_FIRST): Likewise.
    (define_builtin): Add an argument for mask2. Updated to handle
    both ix86_isa_flags and ix86_isa_flags2.
    (define_builtin_const): Likewise.
    (define_builtin_pure): Likewise.
    (define_builtin2): Deleted.
    (define_builtin_const2): Likewise.
    (builtin_description): Add a member, mask2.
    (bdesc_*): Add mask2 to builtin initializations.
    (ix86_init_mmx_sse_builtins): Update calls to def_builtin,
    def_builtin_const and def_builtin_pure. Remove SPECIAL_ARGS2
    support.
    (ix86_get_builtin_func_type): Remove SPECIAL_ARGS2 support.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386-builtin.def
    trunk/gcc/config/i386/i386.c
[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064 --- Comment #22 from Bill Schmidt --- (I'll test with both disabled for LE and report results.)
[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064 --- Comment #21 from Bill Schmidt --- We should probably disable the _v4sf_scalar one for LE also, as this seems to be doing a similar trick for V4SF.
[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064 --- Comment #20 from Bill Schmidt --- Oh, sorry, I missed that in all the commentary. I had looked at the code and seen the "obvious" problem in the expansion, and noted you had suggested that also. Should have read further. I think that's right, using this is wrong for LE. Jakub, do you want to push that patch, or shall I regstrap it once more and take care of it?
[Bug ipa/88933] ICE: verify_cgraph_node failed (Error: caller edge count does not match BB count)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88933

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek ---
gimple_merge_blocks is called in between, which merges a bb with that 1073741825 count with one with a 445388109 count, and nothing updates the call edge count after that adjustment.
[Bug libstdc++/88740] [7/8 Regression] libstdc++ tests no longer print assertion failure messages
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88740

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-01-22
            Summary|[7/8/9 Regression]          |[7/8 Regression] libstdc++
                   |libstdc++ tests no longer   |tests no longer print
                   |print assertion failure     |assertion failure messages
                   |messages                    |
     Ever confirmed|0                           |1

--- Comment #2 from Jonathan Wakely ---
Fixed on trunk so far.
[Bug libstdc++/88740] [7/8/9 Regression] libstdc++ tests no longer print assertion failure messages
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88740

--- Comment #1 from Jonathan Wakely ---
Author: redi
Date: Tue Jan 22 16:08:18 2019
New Revision: 268154

URL: https://gcc.gnu.org/viewcvs?rev=268154&root=gcc&view=rev
Log:
PR libstdc++/88740 Print assertion messages to stderr

    PR libstdc++/88740
    * testsuite/util/testsuite_hooks.h [stderr] (VERIFY): Use fprintf to
    write to stderr instead of using printf.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/testsuite/util/testsuite_hooks.h
[Bug target/88981] [nvptx, openacc, libgomp] How to handle async regions without corresponding wait
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88981

Tom de Vries changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |openacc
             Target|                            |nvptx
                 CC|                            |cltang at gcc dot gnu.org,
                   |                            |tschwinge at gcc dot gnu.org

--- Comment #1 from Tom de Vries ---
Chung-Lin, how would this test-case be handled using the async patch set for gcc 10 stage 1? Is there something done in the generic openacc code?

Thanks,
- Tom
[Bug target/88981] New: [nvptx, openacc, libgomp] How to handle async regions without corresponding wait
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88981

            Bug ID: 88981
           Summary: [nvptx, openacc, libgomp] How to handle async regions without corresponding wait
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Consider this test-case:
...
/* { dg-do run } */

#include

int
main (void)
{
  int a[128];
  int N = 128;

#pragma acc parallel async
  {
#pragma loop seq
    for (int i = 0; i < 1024 * 1024 * 10; ++i)
      a[i % N] += a[N - (i % N) - 1];
  }

  /* no #pragma acc wait */
  return 0;
}
...

At the moment, we run into PR88941:
...
async-no-wait.exe: libgomp/plugin/plugin-nvptx.c: map_fini: \
  Assertion `!s->map->active' failed.
...

Now, consider this patch:
...
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index dd2bcf3083f..e9b0e6c660a 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -489,6 +489,14 @@ fini_streams_for_device (struct ptx_device *ptx_dev)
       struct ptx_stream *s = ptx_dev->active_streams;
       ptx_dev->active_streams = ptx_dev->active_streams->next;

+      {
+       CUresult r;
+       r = CUDA_CALL_NOCHECK (cuStreamQuery, s->stream);
+       if (r == CUDA_ERROR_NOT_READY)
+         GOMP_PLUGIN_error ("Stream destroyed with operation incomplete."
+" Forgot to wait on async?");
+      }
+
       ret &= map_fini (s);

       CUresult r = CUDA_CALL_NOCHECK (cuStreamDestroy, s->stream);
...
which gets us:
...
libgomp: Stream destroyed with operation incomplete. Forgot to wait on async?
async-no-wait.exe: libgomp/plugin/plugin-nvptx.c: map_fini: \
  Assertion `!s->map->active' failed.
...

So, the question is, how to handle async launches without a corresponding wait?

It might be good to notify the user about it, as the above patch does (though perhaps not using GOMP_PLUGIN_error, but GOMP_PLUGIN_warning or some such).

In the case that we call acc_shutdown, it's considered an error if a stream is still running, so we could not just notify, but error out.
[Bug libstdc++/86756] Don't define __cpp_lib_filesystem unless --enable-libstdcxx-filesystem-ts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86756 --- Comment #7 from Jonathan Wakely --- There are more changes needed to the library code, to stop using chdir, mkdir etc. when not supported. This was first brought up in https://gcc.gnu.org/ml/libstdc++/2019-01/msg00039.html I'm not sure how to detect whether those functions are usable though, because apparently they're declared during compilation but absent when loading libstdc++.so at runtime.
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #8 from rguenther at suse dot de ---
On Tue, 22 Jan 2019, maratrus at mail dot ru wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971
>
> --- Comment #7 from Marat Stanichenko ---
> (In reply to rguent...@suse.de from comment #6)
> > > Do you believe that compiler can do better in such situations? Or the
> > > current behaviour is perfectly valid and no improvements are really
> > > needed?
> >
> > The compiler can of course do better when estimating the benefit of
> > inlining. It's just not entirely clear if it is reasonably easy to
> > do so ... [let aside the -finline-function issue]
>
> Is it only about inlining?
>
> Unfortunately, I cannot evaluate the technical difficulties but in patterns
> like presented here i.e.
>
> ```
> if (condition)
>     Function(param);
> ```
>
> irrespective of the fact whether `Function()` is asked to be inlined or not
> there are at least two observations that the compiler may notice before
> taking an optimization decision:
>
> a) Whether `Function(param)` is empty or not. In both scenarios in the
> example presented `PrintBad("<", ">", t)` and `PrintGood("<", t)` are
> actually empty.
>
> b) Whether `param` is used in a `Function`. I think I can see the examples
> when function calls like `PrintBad(">", t)` generate code to construct
> parameter ">" despite the fact that not only it is not used but also the
> function body is empty.
>
> Of course, ideally I expect the parameter not to be constructed if it is not
> used and the whole branch to be eliminated if the `Function` is empty and
> `condition` does not have side effects. But as I said, I do not have enough
> technical expertise to evaluate the cost. I speak from a solely user's
> perspective.
>
> What do you think is the best way to solve this? Keep tracking examples
> until some critical mass is collected?

Well, it's more until any bright idea pops up how to solve this abstraction issue...
[Bug fortran/88980] New: [9 regression] segfault on allocatable string member assignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88980

            Bug ID: 88980
           Summary: [9 regression] segfault on allocatable string member assignment
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: antony at cosmologist dot info
  Target Milestone: ---

This code segfaults with 9.0.0 20181010 and 9.0.0 20190103, but is OK with 8.2.1 (same with P pointer or allocatable):

    program tester
    call gbug
    contains

    subroutine gbug
    type TNameValue
        character(LEN=:), allocatable :: Name
    end type TNameValue

    type TNameValue_pointer
        Type(TNameValue), allocatable :: P
    end type TNameValue_pointer

    Type TType
        type(TNameValue_pointer), dimension(:), allocatable :: Items
    end type TType
    Type(TType) T

    allocate(T%Items(2))
    allocate(T%Items(2)%P)
    T%Items(2)%P%Name = 'test'

    end subroutine gbug

    end program tester
[Bug target/88877] rs6000 emits signed extension for unsigned int type(__floatunsidf).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88877

--- Comment #18 from Segher Boessenkool ---
(In reply to Alan Modra from comment #17)
> > Is anything broken though?
>
> Yes, as demonstrated by the testcase.

I couldn't get the testcase to link, I don't think I have an __floatunsidf anywhere, so I cannot check.

> > If the libcall routines know they are called this way, all is fine.
>
> They don't. libgcc functions are mostly C code that can make use of the
> fact that on ppc64 an unsigned int arg will have the top 32 bits zeroed.

And since they are called without prototype, they should be defined without prototype as well. Why don't we do that? Or is that no longer supported; in that case, the definition should declare types as they will be passed manually. Ugly and fragile.

> We avoid some potential problems with things like popcount by not having a
> popcountsi. Instead we use popcountdi, and that results in gcc
> zero-extending a 32-bit value to 64 bits before we reach
> emit_library_call_value_1.

Wow, ugly. And extra fragile :-(
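To make the zero-extension versus sign-extension distinction in this thread concrete, here is a small sketch. It is not libgcc or rs6000 code: zero_extend() and sign_extend() are hypothetical names, and the sketch only models what a 64-bit register would hold if a 32-bit unsigned value were widened each way. It also assumes the usual two's-complement wraparound for the unsigned-to-signed 32-bit conversion, which GCC provides.

```cpp
#include <cstdint>

// What a callee written in C expects for an `unsigned int` argument on a
// 64-bit target: the top 32 bits of the register are zero.
std::uint64_t zero_extend(std::uint32_t x) {
    return static_cast<std::uint64_t>(x);
}

// What the register holds if the caller sign-extends instead: the top
// 32 bits are copies of bit 31.
std::uint64_t sign_extend(std::uint32_t x) {
    const auto as_signed = static_cast<std::int32_t>(x);
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(as_signed));
}
```

The two functions agree for every value below 0x80000000 and differ for every value with bit 31 set, which is why a mismatch of this kind would only show up for large unsigned inputs.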
[Bug libstdc++/83906] Random FAIL: libstdc++-prettyprinters/80276.cc whatis p4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83906 --- Comment #23 from Jonathan Wakely --- Actually, I wonder if it's caused by r264958 because 'std::string' is no longer unambiguous in libstdc++.so In some translation units it is a typedef for std::basic_string and in other translation units it is a typedef for std::__cxx11::basic_string.
[Bug libstdc++/88947] regex_match doesn't fail early when given a non-matching pattern with a start-of-input anchor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88947 --- Comment #4 from Tomalak Geret'kal --- To be honest I'd expect this in less trivial circumstances too. If, at a given stage of processing, the only possible paths towards a match all require a prefix that's already been ruled out, that should be an immediate return false. To the best of my knowledge this is commonly what happens in regex engines (though again libstdc++ is far from alone in the C++ world in not doing so!)
[Bug middle-end/88968] [8/9 Regression] Stack overflow in gimplify_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88968

Jakub Jelinek changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek ---
Created attachment 45495
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45495&action=edit
gcc9-pr88968.patch

Untested fix.
[Bug target/88954] __attribute__((noplt)) doesn't work with function pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88954

--- Comment #6 from hjl at gcc dot gnu.org ---
Author: hjl
Date: Tue Jan 22 14:53:41 2019
New Revision: 268152

URL: https://gcc.gnu.org/viewcvs?rev=268152&root=gcc&view=rev
Log:
i386: Load external function address via GOT slot

With noplt attribute, we load the external function address via the GOT slot so that the linker won't create a PLT entry for the extern function address.

gcc/

    PR target/88954
    * config/i386/i386.c (ix86_force_load_from_GOT_p): Also check
    noplt attribute.

gcc/testsuite/

    PR target/88954
    * gcc.target/i386/pr88954-1.c: New test.
    * gcc.target/i386/pr88954-2.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr88954-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr88954-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #7 from Marat Stanichenko ---
(In reply to rguent...@suse.de from comment #6)
> > Do you believe that compiler can do better in such situations? Or the
> > current behaviour is perfectly valid and no improvements are really
> > needed?
>
> The compiler can of course do better when estimating the benefit of
> inlining. It's just not entirely clear if it is reasonably easy to
> do so ... [let aside the -finline-function issue]

Is it only about inlining?

Unfortunately, I cannot evaluate the technical difficulties, but in patterns like the one presented here, i.e.

```
if (condition)
    Function(param);
```

irrespective of whether `Function()` is asked to be inlined or not, there are at least two observations that the compiler may make before taking an optimization decision:

a) Whether `Function(param)` is empty or not. In both scenarios in the example presented, `PrintBad("<", ">", t)` and `PrintGood("<", t)` are actually empty.

b) Whether `param` is used in `Function`. I think I can see examples where function calls like `PrintBad(">", t)` generate code to construct the parameter ">" despite the fact that not only is it unused but the function body is also empty.

Of course, ideally I expect the parameter not to be constructed if it is not used, and the whole branch to be eliminated if the `Function` is empty and `condition` does not have side effects. But as I said, I do not have enough technical expertise to evaluate the cost. I speak solely from a user's perspective.

What do you think is the best way to solve this? Keep tracking examples until some critical mass is collected?
[Bug libstdc++/83906] Random FAIL: libstdc++-prettyprinters/80276.cc whatis p4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83906 --- Comment #22 from Jonathan Wakely --- Pedro, I'm seeing this again with GDB 8.2 (specifically gdb-8.2-6.fc29.x86_64). Is it likely to be something different, or a GDB regression? (I still want a libstdc++ fix that works for older GDB anyway).
[Bug c/88955] transparent_union for vector types not accepted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88955

--- Comment #3 from Alexander Monakov ---
Note, without the attribute gcc passes the union in an SSE register, so it doesn't look like TImode on the union matters (otherwise it would be passed via the rdx:rax register pair):

typedef unsigned long u64x2 __attribute__ ((vector_size (16)));

typedef union {
    u64x2 u64;
} v128;

v128 bar(v128 x);

v128 foo(v128 x)
{
    x.u64 *= -1;
    return bar(x);
}

foo:
        vpxor   %xmm1, %xmm1, %xmm1
        vpsubq  %xmm0, %xmm1, %xmm0
        jmp     bar
[Bug c++/88979] New: [C++20] P0634R3 not working for constructor parameter types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88979

            Bug ID: 88979
           Summary: [C++20] P0634R3 not working for constructor parameter types
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: 19Sebastian95 at gmx dot de
  Target Milestone: ---

# gcc -v
Using built-in specs.
COLLECT_GCC=/opt/bin/gcc
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/opt/ --enable-languages=c,c++
Thread model: posix
gcc version 9.0.0 20190118 (experimental) (GCC)

# Description:
When the first constructor of A is uncommented, compiling the following code produces the error shown in the comment. The expected behaviour would be either "error: need 'typename' before 'T::type' because 'T' is a dependent scope" or no error at all.

# Options:
-O2 -std=c++2a -Wall -Wextra

# Source Code:
template class A {
public:
    using type = T::type;

    /*A(T::type a) : mA{a} {}*/ // error: expected ')' before 'a'
    A(type a); // OK

    constexpr void a(T::type a) noexcept { // OK
        mA = a;
    }

    [[nodiscard]] constexpr T::type a() const noexcept { // OK
        return mA;
    }

private:
    T::type mA; // OK
};

template A::A(T::type a) : mA{a} {} // OK

struct B {
    using type = int;
};

int main() {
    A a{20};
    a.a(10);
    return a.a();
}
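For comparison, a portable spelling that does not rely on P0634R3 writes `typename` before every dependent `T::type`, including the constructor parameter. The sketch below is a hypothetical reconstruction rather than the reporter's exact file: the `template <typename T>` parameter list, the `A<B>` instantiation, the set()/get() member names and the run_example() wrapper are all assumptions made for illustration.

```cpp
// Assumed template parameter list: a single type parameter T that exposes
// a nested `type`.
template <typename T>
class A {
public:
    using type = typename T::type;

    // With an explicit `typename`, the in-class constructor parses on
    // compilers and language modes that lack P0634R3.
    A(typename T::type a) : mA{a} {}

    void set(typename T::type a) noexcept { mA = a; }
    typename T::type get() const noexcept { return mA; }

private:
    typename T::type mA;
};

struct B {
    using type = int;
};

// Hypothetical stand-in for the report's main().
int run_example() {
    A<B> a{20};
    a.set(10);
    return a.get();
}
```

Under P0634R3 the `typename` keyword becomes optional in these contexts, so the report's expectation is that the unqualified constructor parameter is either accepted or diagnosed with the usual "need 'typename'" message.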
[Bug tree-optimization/88978] Failed outer loop vectorization with grouped accesses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88978 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-01-22 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Mine.
[Bug tree-optimization/88978] New: Failed outer loop vectorization with grouped accesses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88978 Bug ID: 88978 Summary: Failed outer loop vectorization with grouped accesses Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- We fail to vectorize outer loops when there are grouped accesses in the inner loop: int a[1024]; int b[1024][1024]; void foo () { for (int i = 0; i < 512; ++i) { int a1 = a[2*i]; int a2 = a[2*i+1]; for (int j = 0; j < 1024; ++j) { b[j][2*i] = a1; b[j][2*i+1] = a2; } } } This is mostly because we cannot do SLP here (for implementation reasons). We are vectorizing the following just fine, applying interleaving to the outer loop accesses: int a[1024]; int b[1024][1024]; void foo () { for (int i = 0; i < 512; ++i) { int a1 = a[2*i]; int a2 = a[2*i+1]; for (int j = 0; j < 1024; ++j) b[j][i] = a1+a2; } } The guard in question is the following which is premature (before SLP would be even tried) and somewhat inaccurate since it is grouped accesses in the inner loop when doing outer loop vectorization rather than grouped accesses in an outer loop that fail. static bool vect_analyze_data_ref_access (dr_vec_info *dr_info) { ... if (loop && nested_in_vect_loop_p (loop, stmt_info)) { if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "grouped access in outer loop.\n"); return false; }
[Bug tree-optimization/88240] Potential optimization bug: invalid pre-load of floating-point value could cause SIGFPE-underflow if value is integer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88240 Richard Biener changed: What|Removed |Added Keywords||wrong-code Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-01-22 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #12 from Richard Biener --- Yes, IMHO the bug is valid. I've meant to assign myself (though don't hold your breath for a fix).
[Bug tree-optimization/88919] New test case gcc.dg/vect/pr88903-1.c in r268076 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88919 --- Comment #4 from rguenther at suse dot de --- On Tue, 22 Jan 2019, clyon at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88919 > > Christophe Lyon changed: > >What|Removed |Added > > CC||clyon at gcc dot gnu.org > > --- Comment #3 from Christophe Lyon --- > (In reply to Richard Biener from comment #2) > > Sandra posted a patch that will probably fix this (out-of-bound shift > > values). > > Do you mean https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01207.html ? Yes.
[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #22 from rguenther at suse dot de --- On Tue, 22 Jan 2019, ktkachov at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 > > --- Comment #21 from ktkachov at gcc dot gnu.org --- > So the actual hot loop in xz_r does: > typedef unsigned char __uint8_t; > typedef unsigned int __uint32_t; > typedef unsigned long long __uint64_t; > > int > foo (const __uint64_t len_limit, const __uint8_t *cur, > __uint32_t delta, int len) { > > const __uint8_t *pb = cur - delta; > > while (++len != len_limit) { > if (pb[len] != cur[len]) > break; > } > > return len; > } > > The 'pb' pointer is the 'cur' pointer but moved back by 'delta'. > Presumably that means that all memory between 'pb' and 'cur' could be > read in as wide a load as possible? A C language lawyer would agree with that. But does it really help? The loop also accesses [cur + len, cur + len_limit].
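To make the "wide load" idea concrete, here is a hand-written sketch of the comparison done eight bytes at a time. It is not what the vectorizer emits and not the xz code: `match_len_wide` is a hypothetical helper, it returns the first mismatching index at or after `len` (a slightly different convention from the pre-increment loop above), and it assumes both `pb` and `cur` are readable for the full `len_limit` bytes, which is exactly the property questioned in the reply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: first index i in [len, len_limit) with
// pb[i] != cur[i], or len_limit if there is no mismatch.
// Assumption: pb[0..len_limit) and cur[0..len_limit) are both readable.
std::size_t match_len_wide(const std::uint8_t *pb, const std::uint8_t *cur,
                           std::size_t len, std::size_t len_limit)
{
  assert(len <= len_limit);
  constexpr std::size_t W = sizeof(std::uint64_t);

  // Compare whole 8-byte words while at least W bytes remain.
  // memcpy keeps the loads free of alignment and aliasing problems.
  while (len_limit - len >= W) {
    std::uint64_t a, b;
    std::memcpy(&a, pb + len, W);
    std::memcpy(&b, cur + len, W);
    if (a != b)
      break;  // the mismatch is somewhere inside this word
    len += W;
  }

  // Finish byte by byte: this locates the mismatch inside the word,
  // or handles the tail shorter than W.
  while (len < len_limit && pb[len] == cur[len])
    ++len;
  return len;
}
```

The byte-wise tail keeps the sketch independent of endianness; a tuned version would more likely locate the differing byte with a count-trailing-zeros on the XOR of the two words.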
[Bug target/88469] [7/8 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469 Richard Earnshaw changed: What|Removed |Added Summary|[7/8/9 regression] AAPCS - |[7/8 regression] AAPCS - |Struct with 64-bit bitfield |Struct with 64-bit bitfield |may be passed in wrong |may be passed in wrong |registers |registers --- Comment #7 from Richard Earnshaw --- Fixed on trunk. Still need mitigation for gcc-7/8 and to deal with bootstrapping gcc-9 with gcc-6/7/8.
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 --- Comment #6 from rguenther at suse dot de --- On Tue, 22 Jan 2019, maratrus at mail dot ru wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 > > --- Comment #5 from Marat Stanichenko --- > > (In reply to Richard Biener from comment #1) > > This is because it still needs to generate the std::string objects at the > > caller > > Thank you very much for the comment! That is probably why I had to add another > string parameter to the function `PrintBad` in the example provided to trigger > this behaviour. The fact is that in the production environment where I managed > to spot this, the compiler could not optimize the function that is very > similar > to `PrintGood` and has just a single string parameter. I didn't manage to > reproduce it when simplifying things while preparing a test-case here. > > Do you believe that compiler can do better in such situations? Or the current > behaviour is perfectly valid and no improvements are really needed? The compiler can of course do better when estimating the benefit of inlining. It's just not entirely clear if it is reasonably easy to do so ... [leaving aside the -finline-functions issue]
[Bug ipa/88936] [7/8/9 Regression] -fipa-pta breaks bash (incorrect optimisation of recursive static function)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88936 --- Comment #10 from Richard Biener --- So the idea for the fix is to make locals that escape through recursive edges behave as if they were really *ptr with ptr pointing to , where localp would be "the other locals". This could be done on the constraint level. Semantically equivalent is doing the above by post-processing the points-to sets after propagation and replacing 'local' with 'local + localp'. We'd need to gather a bitmap of candidate UIDs for this which we could eventually prune by the set of vars that do not escape through such an edge (implementation is not entirely clear). What is missing right now is a conservative predicate telling us whether defined function X is reachable recursively. Honza?
[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #21 from ktkachov at gcc dot gnu.org --- So the actual hot loop in xz_r does: typedef unsigned char __uint8_t; typedef unsigned int __uint32_t; typedef unsigned long long __uint64_t; int foo (const __uint64_t len_limit, const __uint8_t *cur, __uint32_t delta, int len) { const __uint8_t *pb = cur - delta; while (++len != len_limit) { if (pb[len] != cur[len]) break; } return len; } The 'pb' pointer is the 'cur' pointer but moved back by 'delta'. Presumably that means that all memory between 'pb' and 'cur' could be read in as wide a load as possible?
[Bug target/88469] [7/8/9 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469 --- Comment #6 from Richard Earnshaw --- Author: rearnsha Date: Tue Jan 22 14:03:22 2019 New Revision: 268151 URL: https://gcc.gnu.org/viewcvs?rev=268151=gcc=rev Log: [arm] PR target/88469 fix incorrect argument passing with 64-bit bitfields Unfortunately another PCS bug has come to light with the layout of structs whose alignment is dominated by a 64-bit bitfield element. Such fields in the type list appear to have alignment 1, but in reality, for the purposes of alignment of the underlying structure, the alignment is derived from the underlying bitfield's type. We've been getting this wrong since support for over-aligned record types was added several releases back. Worse still, the existing code may generate unaligned memory accesses that may fault on some versions of the architecture. I've taken the opportunity to add a few more tests that check the passing arguments with overalignment in the PCS. Looking through the existing tests it looked like they were really only checking self-consistency and not the precise location of the arguments. PR target/88469 gcc: * config/arm/arm.c (arm_needs_doubleword_align): Return 2 if a record's alignment is dominated by a bitfield with 64-bit aligned base type. (arm_function_arg): Emit a warning if the alignment has changed since earlier GCC releases. (arm_function_arg_boundary): Likewise. (arm_setup_incoming_varargs): Likewise. gcc/testsuite: * gcc.target/arm/aapcs/bitfield1.c: New test. * gcc.target/arm/aapcs/overalign_rec1.c: New test. * gcc.target/arm/aapcs/overalign_rec2.c: New test. * gcc.target/arm/aapcs/overalign_rec3.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield1.c trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec1.c trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec2.c trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec3.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm.c trunk/gcc/testsuite/ChangeLog
[Bug target/88965] powerpc64le vector builtin hits ICE in verify_gimple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88965 --- Comment #5 from Anton Blanchard --- Martin: "gcc -c x.c" was enough to hit it on a build of trunk on my POWER9 ppc64le box. Jakub: Thanks, that fixes it for me.
[Bug tree-optimization/88919] New test case gcc.dg/vect/pr88903-1.c in r268076 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88919 Christophe Lyon changed: What|Removed |Added CC||clyon at gcc dot gnu.org --- Comment #3 from Christophe Lyon --- (In reply to Richard Biener from comment #2) > Sandra posted a patch that will probably fix this (out-of-bound shift > values). Do you mean https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01207.html ?
[Bug tree-optimization/88044] [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88044 --- Comment #17 from Christophe Lyon --- (In reply to Jakub Jelinek from comment #16) > Fixed. I confirm the problem I mentioned in #c3 is now fixed. Thanks!
[Bug tree-optimization/86214] [8 Regression] Strongly increased stack usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214 Christophe Lyon changed: What|Removed |Added CC||clyon at gcc dot gnu.org --- Comment #20 from Christophe Lyon --- (In reply to Jakub Jelinek from comment #17) > Author: jakub > Date: Fri Jan 18 10:07:27 2019 > New Revision: 268067 > > URL: https://gcc.gnu.org/viewcvs?rev=268067=gcc=rev > Log: > PR tree-optimization/86214 > * tree-inline.h (struct copy_body_data): Add > add_clobbers_to_eh_landing_pads member. > * tree-inline.c (add_clobbers_to_eh_landing_pad): New function. > (copy_edges_for_bb): Call it if EH edge destination is < > id->add_clobbers_to_eh_landing_pads. Fix a comment typo. > (expand_call_inline): Set id->add_clobbers_to_eh_landing_pads > if flag_stack_reuse != SR_NONE and clear it afterwards. > > * g++.dg/opt/pr86214-1.C: New test. > * g++.dg/opt/pr86214-2.C: New test. > > Added: > trunk/gcc/testsuite/g++.dg/opt/pr86214-1.C > trunk/gcc/testsuite/g++.dg/opt/pr86214-2.C > Modified: > trunk/gcc/ChangeLog > trunk/gcc/testsuite/ChangeLog > trunk/gcc/tree-inline.c > trunk/gcc/tree-inline.h Hi Jakub, Since you committed this patch, I've noticed regressions on some arm targets: FAIL: 23_containers/list/requirements/exception/basic.cc execution test libstdc++.log has this: spawn [open ...] 
N10__gnu_test12functor_base19iterator_operationsINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_EE end count 2 N10__gnu_test12functor_base25const_iterator_operationsINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_EE end count 3 N10__gnu_test12functor_base11erase_pointINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1ELb0EEE end count 4 N10__gnu_test12functor_base11erase_rangeINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1ELb0EEE end count 5 N10__gnu_test12functor_base12insert_pointINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1ELb0EEE end count 6 N10__gnu_test12functor_base7emplaceINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb0EEE end count 7 N10__gnu_test12functor_base13emplace_pointINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1ELb0ELb0EEE end count 8 N10__gnu_test12functor_base13emplace_frontINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1EEE end count 9 N10__gnu_test12functor_base12emplace_backINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1EEE end count 10 N10__gnu_test12functor_base9pop_frontINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1EEE end count 11 N10__gnu_test12functor_base8pop_backINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1EEE end count 12 N10__gnu_test12functor_base10push_frontINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1EEE end count 13 N10__gnu_test12functor_base9push_backINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb1EEE end count 14 N10__gnu_test12functor_base6rehashINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_Lb0EEE end count 15 
N10__gnu_test12functor_base4swapINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS4_21throw_allocator_limitIS5_EE end count 16 qemu: uncaught target signal 11 (Segmentation fault) - core dumped The qemu trace does not seem very helpful (and is probably incomplete): IN: _ZN10__gnu_test12basic_safetyINSt7__cxx114listIN9__gnu_cxx17throw_value_limitENS3_21throw_allocator_limitIS4_E3runEv 0x0001c724: e5943008 ldr r3, [r4, #8] 0x0001c728: e59d2010 ldr r2, [sp, #0x10] 0x0001c72c: e59d101c ldr r1, [sp, #0x1c] 0x0001c730: e353 cmp r3, #0 0x0001c734: e5987000 ldr r7, [r8] 0x0001c738: e5812000 str r2, [r1] 0x0001c73c: e5896000 str r6, [sb] 0x0001c740: e5882000 str r2, [r8] 0x0001c744: 0a000510 beq #0x1db8c IN: 0x40adc6bc: e0849009 add sb, r4, sb 0x40adc6c0: e59f29c8 ldr r2, [pc, #0x9c8] 0x40adc6c4: e5993004 ldr r3, [sb, #4] 0x40adc6c8: e08f2002 add r2, pc, r2 0x40adc6cc: e3833001 orr r3, r3, #1 0x40adc6d0: e1550002 cmp r5, r2 0x40adc6d4: e59da018 ldr sl, [sp, #0x18] 0x40adc6d8: e5893004 str r3, [sb, #4] 0x40adc6dc: 0a02 beq #0x40adc6ec IN: 0x40adc6ec: e59f39a0 ldr r3, [pc, #0x9a0] 0x40adc6f0: e2846008 add r6, r4, #8 0x40adc6f4: e08f3003 add r3, pc, r3 0x40adc6f8: e593102c ldr r1, [r3, #0x2c] 0x40adc6fc: e351 cmp r1, #0
[Bug rtl-optimization/88948] [9 Regression] ICE in elimination_costs_in_insn, at reload1.c:3640 since r264148
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88948 --- Comment #3 from Uroš Bizjak --- Created attachment 45494 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45494=edit Proposed patch
[Bug tree-optimization/88240] Potential optimization bug: invalid pre-load of floating-point value could cause SIGFPE-underflow if value is integer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88240 --- Comment #11 from Thomas De Schampheleire --- Any feedback? With the reduced testcase qemu is out of the picture. Do you agree that this is a bug in gcc?
[Bug c++/88977] New: __builtin_is_constant_evaluated() as function template argument causes substitution failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88977 Bug ID: 88977 Summary: __builtin_is_constant_evaluated() as function template argument causes substitution failure Product: gcc Version: unknown Status: UNCONFIRMED Keywords: rejects-valid Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ensadc at mailnesia dot com Target Milestone: --- https://wandbox.org/permlink/LYuJZ4w7YLgidqdi template<bool> int f(); int x = f<__builtin_is_constant_evaluated()>(); prog.cc:3:46: error: no matching function for call to 'f<__builtin_is_constant_evaluated()>()' 3 | int x = f<__builtin_is_constant_evaluated()>(); | ^ prog.cc:1:20: note: candidate: 'template<bool <anonymous> > int f()' 1 | template<bool> int f(); |^ prog.cc:1:20: note: template argument deduction/substitution failed: It works fine with class/variable/alias templates.
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 --- Comment #5 from Marat Stanichenko --- (In reply to Richard Biener from comment #1) > This is because it still needs to generate the std::string objects at the > caller Thank you very much for the comment! That is probably why I had to add another string parameter to the function `PrintBad` in the example provided to trigger this behaviour. The fact is that in the production environment where I managed to spot this, the compiler could not optimize the function that is very similar to `PrintGood` and has just a single string parameter. I didn't manage to reproduce it when simplifying things while preparing a test-case here. Do you believe that compiler can do better in such situations? Or the current behaviour is perfectly valid and no improvements are really needed?
[Bug ipa/88936] [7/8/9 Regression] -fipa-pta breaks bash (incorrect optimisation of recursive static function)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88936 --- Comment #9 from Richard Biener --- (In reply to Jan Hubicka from comment #7) > Hi, > there is ipa_reduced_postorder that will compute SCCs and store scc > index. OK, from looking at examples it seems that I can access node->aux as ipa_dfs_info * after ipa_reduced_postorder and before I call ipa_free_postorder_info. To check whether a call is possibly recursing to the caller I'd then check whether the callers and the callees DFS number match. That works as far as direct calls are considered - but what about indirect calls? Not that I'm sure what to do when hitting a possibly recursive call - dropping all the way to pt_anything for arguments would be a bit harsh. But whether ensuring that we do not end up with a singleton composed of caller automatic vars is enough I'm not sure...
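The SCC check described above can be illustrated on a toy call graph. This is only a sketch of the idea, not GCC code: it does not use `ipa_reduced_postorder` or `ipa_dfs_info`, the names `Tarjan`, `scc_ids` and `edge_may_recurse` are made up for the example, and indirect calls (the open question in the comment) are not modelled at all. It runs Tarjan's algorithm and treats a direct call edge as possibly recursive when caller and callee end up in the same strongly connected component.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy call graph: callees[f] lists the functions f calls directly.
// Tarjan's strongly-connected-components algorithm over that graph.
struct Tarjan {
  const std::vector<std::vector<int>> &callees;
  std::vector<int> index, low, scc;
  std::vector<bool> on_stack;
  std::vector<int> stack;
  int next_index = 0, next_scc = 0;

  explicit Tarjan(const std::vector<std::vector<int>> &c)
    : callees(c), index(c.size(), -1), low(c.size(), 0),
      scc(c.size(), -1), on_stack(c.size(), false) {}

  void visit(int v) {
    index[v] = low[v] = next_index++;
    stack.push_back(v);
    on_stack[v] = true;
    for (int w : callees[v]) {
      if (index[w] == -1) {          // tree edge: recurse first
        visit(w);
        low[v] = std::min(low[v], low[w]);
      } else if (on_stack[w]) {      // edge back into the current SCC
        low[v] = std::min(low[v], index[w]);
      }
    }
    if (low[v] == index[v]) {        // v is the root of an SCC: pop it
      int w;
      do {
        w = stack.back();
        stack.pop_back();
        on_stack[w] = false;
        scc[w] = next_scc;
      } while (w != v);
      ++next_scc;
    }
  }
};

// Returns one SCC id per function.
std::vector<int> scc_ids(const std::vector<std::vector<int>> &callees) {
  Tarjan t(callees);
  for (int v = 0; v < static_cast<int>(callees.size()); ++v)
    if (t.index[v] == -1)
      t.visit(v);
  return t.scc;
}

// For an existing direct call edge caller -> callee: conservatively
// "may recurse" when both ends are in the same SCC (this includes
// direct self-calls).
bool edge_may_recurse(const std::vector<int> &scc, int caller, int callee) {
  assert(caller >= 0 && static_cast<std::size_t>(caller) < scc.size());
  assert(callee >= 0 && static_cast<std::size_t>(callee) < scc.size());
  return scc[caller] == scc[callee];
}
```

Note that the predicate is only meaningful for edges that actually exist in the graph; asked about a non-existent self edge it would still answer true.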
[Bug c++/88976] New: ICE in fold_convert_loc, at fold-const.c:2552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88976 Bug ID: 88976 Summary: ICE in fold_convert_loc, at fold-const.c:2552 Product: gcc Version: 4.9.0 Status: UNCONFIRMED Keywords: ice-on-valid-code, openmp Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com Target Milestone: --- g++-9.0.0-alpha20190120 snapshot (r268107), 8.2, 7.4, 6.3, 5.5, 4.9.4 all ICE when compiling the following snippet w/ -fopenmp: template<typename T> void jm (T cv) { #pragma omp cancel parallel if (cv) } % g++-9.0.0-alpha20190120 -fopenmp -c icjm7wqb.cpp icjm7wqb.cpp: In function 'void jm(T)': icjm7wqb.cpp:4:36: internal compiler error: in fold_convert_loc, at fold-const.c:2552 4 | #pragma omp cancel parallel if (cv) |^ 0x6d8408 fold_convert_loc(unsigned int, tree_node*, tree_node*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/fold-const.c:2552 0xa307b1 finish_omp_cancel(tree_node*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/semantics.c:9060 0x997e65 cp_parser_omp_cancel /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:37755 0x997e65 cp_parser_pragma /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:40735 0x9a00ec cp_parser_statement /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:11204 0x9a0c38 cp_parser_statement_seq_opt /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:11592 0x9a0d18 cp_parser_compound_statement /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:11546 0x9bab16 cp_parser_function_body /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:22530 0x9bab16 cp_parser_ctor_initializer_opt_and_function_body /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:22567 0x9bb3f0 
cp_parser_function_definition_after_declarator /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27630 0x9bc1d4 cp_parser_function_definition_from_specifiers_and_declarator /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27546 0x9bc1d4 cp_parser_init_declarator /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:20205 0x9bf7a4 cp_parser_single_declaration /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:28096 0x9bf90d cp_parser_template_declaration_after_parameters /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27688 0x9c026e cp_parser_explicit_template_declaration /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27934 0x9c026e cp_parser_template_declaration_after_export /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:27953 0x9c2e09 cp_parser_declaration /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:13122 0x9c346e cp_parser_translation_unit /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:4698 0x9c346e c_parse_file() /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cp/parser.c:41003 0xacce0b c_common_parse_file() /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c-family/c-opts.c:1155
[Bug tree-optimization/88975] New: ICE: Segmentation fault (in verify_ssa or gimple_code)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88975 Bug ID: 88975 Summary: ICE: Segmentation fault (in verify_ssa or gimple_code) Product: gcc Version: 9.0 Status: UNCONFIRMED Keywords: ice-on-valid-code, openmp Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com Target Milestone: --- 1. gcc-9.0.0-alpha20190120 snapshot (r268107) ICEs when compiling the following snippet w/ -fopenmp -fchecking: void mr (int gy) { int ij[gy]; int sk; #pragma omp taskloop reduction(+:ij) for (sk = 0; sk < 1; ++sk) { } } % gcc-9.0.0-alpha20190120 -fopenmp -c xsihfn9u.c during GIMPLE pass: ssa xsihfn9u.c: In function 'mr': xsihfn9u.c:11:1: internal compiler error: Segmentation fault 11 | } | ^ 0xd701df crash_signal /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/toplev.c:326 0xf923dc verify_ssa(bool, bool) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree-ssa.c:1050 0xc89d0d execute_function_todo /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/passes.c:1984 0xc8ab0e execute_todo /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/passes.c:2031 2. 
Compiling the above snippet w/ -fopenmp -fno-checking yields the following instead: % gcc-9.0.0-alpha20190120 -fopenmp -fno-checking -c xsihfn9u.c during RTL pass: expand xsihfn9u.c: In function 'mr': xsihfn9u.c:2:1: internal compiler error: Segmentation fault 2 | mr (int gy) | ^~ 0xd701df crash_signal /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/toplev.c:326 0xf8db84 gimple_code /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/gimple.h:1689 0xf8db84 gimple_nop_p /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/gimple.h:6466 0xf8db84 ssa_undefined_value_p(tree_node*, bool) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree-ssa.c:1295 0xf8db84 ssa_undefined_value_p(tree_node*, bool) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree-ssa.c:1286 0xe218df get_undefined_value_partitions /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree-outof-ssa.c:978 0xe218df remove_ssa_form /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree-outof-ssa.c:1072 0xe218df rewrite_out_of_ssa(ssaexpand*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/tree-outof-ssa.c:1306 0x912660 execute /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/cfgexpand.c:6314
[Bug preprocessor/88974] New: [9 Regression] ICE: Segmentation fault (in linemap_resolve_location)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88974 Bug ID: 88974 Summary: [9 Regression] ICE: Segmentation fault (in linemap_resolve_location) Product: gcc Version: 9.0 Status: UNCONFIRMED Keywords: error-recovery, ice-on-invalid-code Severity: normal Priority: P3 Component: preprocessor Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com Target Milestone: --- gcc-9.0.0-alpha20190120 snapshot (r268107) ICEs when compiling the following snippet derived from test/Frontend/rewrite-includes-invalid-hasinclude.c from the clang 7.0.1 testsuite: #if __has_include__ ( character 1 | #if __has_include__ (
[Bug middle-end/88968] [8/9 Regression] Stack overflow in gimplify_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88968 --- Comment #3 from Jakub Jelinek --- The problem is that for these packed structs the DECL_BIT_FIELD_REPRESENTATIVE is not integral FIELD_DECL that the c-omp.c code assumes. BIT_FIELD_REF seems to work with non-integral base types from which the field is extracted: /* Reference to a group of bits within an object. Similar to COMPONENT_REF except the position is given explicitly rather than via a FIELD_DECL. Operand 0 is the structure or union expression; operand 1 is a tree giving the constant number of bits being referenced; operand 2 is a tree giving the constant position of the first referenced bit. The result type width has to match the number of bits referenced. If the result type is integral, its signedness specifies how it is extended to its mode width. */ DEFTREECODE (BIT_FIELD_REF, "bit_field_ref", tcc_reference, 3) but we need to insert the field back and for that the BIT_INSERT_EXPR we are using requires that it stores into an integral or vector type expression: /* Given a container value, a replacement value and a bit position within the container, produce the value that results from replacing the part of the container starting at the bit position with the replacement value. Operand 0 is a tree for the container value of integral or vector type; Operand 1 is a tree for the replacement value of another integral or the vector element type; Operand 2 is a tree giving the constant bit position; The number of bits replaced is given by the precision of the type of the replacement value if it is integral or by its size if it is non-integral. ??? The reason to make the size of the replacement implicit is to avoid introducing a quaternary operation. The replaced bits shall be fully inside the container. If the container is of vector type, then these bits shall be aligned with its elements. */ DEFTREECODE (BIT_INSERT_EXPR, "bit_insert_expr", tcc_expression, 3)
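As a rough illustration of what the two tree codes compute, the following sketch models them on a plain 64-bit integer container. This is not GCC code: `low_mask`, `extract_bits` and `insert_bits` are hypothetical helpers, the width is passed explicitly instead of being implied by the replacement's type, and bit 0 is assumed to be the least significant bit.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `width` bits set, for 1 <= width <= 64.
inline std::uint64_t low_mask(unsigned width) {
  assert(width >= 1 && width <= 64);
  return width == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << width) - 1;
}

// Model of BIT_FIELD_REF on an integer: read `width` bits starting at
// bit `pos` of `container`.
inline std::uint64_t extract_bits(std::uint64_t container,
                                  unsigned width, unsigned pos) {
  assert(pos < 64 && width <= 64 - pos);
  return (container >> pos) & low_mask(width);
}

// Model of BIT_INSERT_EXPR on an integer: return `container` with the
// `width` bits starting at `pos` replaced by the low bits of `value`.
// The replaced bits must lie fully inside the container.
inline std::uint64_t insert_bits(std::uint64_t container, std::uint64_t value,
                                 unsigned width, unsigned pos) {
  assert(pos < 64 && width <= 64 - pos);
  const std::uint64_t mask = low_mask(width) << pos;
  return (container & ~mask) | ((value << pos) & mask);
}
```

With a 17-bit representative holding a 16-bit field at bit 0 and a 1-bit field at bit 16 (an assumed layout for the packed struct in this PR), an atomic capture that reads the 16-bit field and then stores 0 corresponds to `extract_bits(c, 16, 0)` followed by `insert_bits(c, 0, 16, 0)`. The model only works because the container is an integer, which is the requirement the packed struct's representative fails to meet.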
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #4 from Richard Biener --- (In reply to Jonathan Wakely from comment #3) > (In reply to Richard Biener from comment #1) > > This is because it still needs to generate the std::string objects at the > > caller > > site (outside of the if (print)). This involves quite some code to get > > rid of, and even at -O3 we do not inline basic_string::basic_string it seems > > (ISTR that is out-of-line in the library): > > > > __asm__ __volatile__("mfence" : : : "memory"); > > _6 = MEM[(const int *) + 4B]; > > if (_6 > 0) > > goto ; [41.48%] > > else > > goto ; [58.52%] > > > >[local count: 445388109]: > > std::basic_string::basic_string (, "<", ); > > _7 = MEM[(char * *)]; > > _8 = _7 + 18446744073709551592; > > if (_8 != &_S_empty_rep_storage) > > goto ; [10.00%] > > else > > goto ; [90.00%] > > Looks like you're using -D_GLIBCXX_USE_CXX11_ABI=0 but the OP is not. Indeed. It's still missed inlining that makes eliding of the argument construction difficult. Looks like std::__cxx11::basic_string::_M_construct is not marked inline (so needs -finline-functions to get IPA inlining processing). Indeed I see // For forward_iterators up to random_access_iterators, used for // string::iterator, _CharT*, etc. template<typename _FwdIterator> void _M_construct(_FwdIterator __beg, _FwdIterator __end, std::forward_iterator_tag); and others.
[Bug middle-end/88968] [8/9 Regression] Stack overflow in gimplify_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88968 Arseny Solokha changed: What|Removed |Added Component|c |middle-end --- Comment #2 from Arseny Solokha --- struct { unsigned int hq : 16; unsigned int dv : 1; } __attribute__ ((__packed__)) e2; int yp (void) { int sr; #pragma omp atomic capture { sr = e2.hq; e2.hq = 0; } return sr; } % gcc-9.0.0-alpha20190120 -fopenmp -c zto53g7w.c zto53g7w.c: In function 'yp': zto53g7w.c:15:3: internal compiler error: in fold_convert_loc, at fold-const.c:2552 15 | } | ^ 0x615112 fold_convert_loc(unsigned int, tree_node*, tree_node*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/fold-const.c:2552 0xa741b8 omit_one_operand_loc(unsigned int, tree_node*, tree_node*, tree_node*) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/fold-const.c:3769 0x8918ad c_finish_omp_atomic(unsigned int, tree_code, tree_code, tree_node*, tree_node*, tree_node*, tree_node*, tree_node*, bool, omp_memory_order, bool) /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c-family/c-omp.c:412 0x834307 c_parser_omp_atomic /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:16470 0x843aea c_parser_omp_construct /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:19490 0x821667 c_parser_pragma /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:11562 0x83b2b4 c_parser_compound_statement_nostart /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:5114 0x83b8b8 c_parser_compound_statement /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:4980 0x83d1b5 c_parser_declaration_or_fndef /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:2352 0x84456f c_parser_external_declaration /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:1653 0x844fb1 
c_parser_translation_unit /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:1534 0x844fb1 c_parse_file() /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c/c-parser.c:19840 0x898bcb c_common_parse_file() /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190120/work/gcc-9-20190120/gcc/c-family/c-opts.c:1155
[Bug lto/51765] [9 Regression] Testsuite ICEs with -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51765 Arseny Solokha changed: What|Removed |Added CC||asolokha at gmx dot com --- Comment #9 from Arseny Solokha --- (In reply to Jan Hubicka from comment #8) > during IPA pass: fnsummary > /aux/hubicka/trunk4/gcc/testsuite/g++.dg/ext/vector33.C:10:1: internal > compiler error: tree code 'template_parm_index' is not supported in LTO > streams This is PR83997, which already has some problem analysis by Jakub.
[Bug target/88952] The asm operator modifiers for rs6000 should be documented like they are for x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88952 --- Comment #11 from Uroš Bizjak --- (In reply to Christopher Leonard from comment #10) > Getting contradictory statements now: > >reg:reg+1 maps to lo:hi on x86. > >On x86, we don't allow register pairs in asm at all. > > Not allowing, or printing a warning, is much better behavior than what I > have been getting on PPC. Ah, sorry - x86 emits a warning.
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 --- Comment #3 from Jonathan Wakely --- (In reply to Richard Biener from comment #1) > This is because it still needs to generate the std::string objects at the > caller > site (outside of the if (print)). This involves quite some code to get > rid of, and even at -O3 we do not inline basic_string::basic_string it seems > (ISTR that is out-of-line in the library): > > __asm__ __volatile__("mfence" : : : "memory"); > _6 = MEM[(const int *) + 4B]; > if (_6 > 0) > goto ; [41.48%] > else > goto ; [58.52%] > >[local count: 445388109]: > std::basic_string::basic_string (, "<", ); > _7 = MEM[(char * *)]; > _8 = _7 + 18446744073709551592; > if (_8 != &_S_empty_rep_storage) > goto ; [10.00%] > else > goto ; [90.00%] Looks like you're using -D_GLIBCXX_USE_CXX11_ABI=0 but the OP is not.
[Bug target/88972] popcnt of limited 128-bit number with unnecessary zeroing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88972 Uroš Bizjak changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |INVALID --- Comment #2 from Uroš Bizjak --- This is by design. /* X86_TUNE_AVOID_FALSE_DEP_FOR_BMI: Avoid false dependency for bit-manipulation instructions. */ DEF_TUNE (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI, "avoid_false_dep_for_bmi", m_SANDYBRIDGE | m_CORE_AVX2 | m_GENERIC)
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 --- Comment #2 from Jonathan Wakely --- (In reply to Richard Biener from comment #1) > rid of, and even at -O3 we do not inline basic_string::basic_string it seems > (ISTR that is out-of-line in the library): There's an explicit instantiation in the library, but the definition is inline in the headers. If the compiler wanted to inline it, all the code is visible and nothing forces it to use the explicit instantiation in the library.
[Bug tree-optimization/88964] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1561
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88964 Jakub Jelinek changed: What|Removed |Added Attachment #45491|0 |1 is obsolete|| --- Comment #8 from Jakub Jelinek --- Created attachment 45493 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45493&action=edit gcc9-pr88964.patch Updated patch.
[Bug tree-optimization/88973] [8/9 Regression] New -Wrestrict warning since r268048
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88973 Richard Biener changed: What|Removed |Added Keywords||diagnostic Priority|P3 |P2 Known to work||8.2.0 Target Milestone|--- |8.3 Summary|New -Wrestrict warning |[8/9 Regression] New |since r268048 |-Wrestrict warning since ||r268048 Known to fail||8.2.1 --- Comment #1 from Richard Biener --- I believe the change was backported as well.
[Bug tree-optimization/88964] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1561
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88964 --- Comment #7 from Jakub Jelinek --- Actually no, with HONOR_SIGNED_ZEROS it shouldn't be optimized out. So, if we have no other way to distinguish a normal chrec with step +0.0 from a loop-invariant var, we should punt at least for HONOR_SIGNED_ZEROS.
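The signed-zero concern can be checked in isolation. The following is a minimal sketch, not code from GCC: it assumes IEEE 754 arithmetic in the default rounding mode and no -ffast-math style flags, and the helper name sign_after_adding_zero is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: returns the sign bit of (x + 0.0).
// Under IEEE 754 round-to-nearest, (-0.0) + (+0.0) is +0.0, so adding a
// zero step is not an identity operation when signed zeros are honored.
bool sign_after_adding_zero(double x) {
    volatile double v = x;        // volatile discourages constant folding
    volatile double r = v + 0.0;  // the "+ 0.0" step under discussion
    return std::signbit(r);
}
```

For every non-zero input the sign is preserved; only -0.0 changes, which is why folding the addition away is only safe when signed zeros need not be honored.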
[Bug target/88972] popcnt of limited 128-bit number with unnecessary zeroing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88972 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Target||x86_64-*-*, i?86-*-* Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-22 Component|tree-optimization |target Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Err, __builtin_popcount has an integer argument so you call popcount on (int)m. The reason must be different. (insn 17 16 26 4 (parallel [ (set (reg:SI 88 [ ]) (popcount:SI (subreg:SI (reg/v:TI 89 [ m ]) 0))) (clobber (reg:CC 17 flags)) ]) "t.c":4 -1 (nil))
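The point about the argument type can be illustrated with a short sketch. It assumes a GCC-compatible compiler on a target with unsigned __int128, a 32-bit unsigned int and a 64-bit unsigned long long; the function names are hypothetical.

```cpp
#include <cassert>

// __builtin_popcount takes an unsigned int, so a 128-bit argument is
// converted first and only its low bits are counted. The cast below makes
// that implicit conversion explicit.
int popcount_low32(unsigned __int128 m) {
    return __builtin_popcount(static_cast<unsigned int>(m));
}

// Counting all 128 bits needs two 64-bit popcounts.
int popcount_full(unsigned __int128 m) {
    const unsigned long long lo = static_cast<unsigned long long>(m);
    const unsigned long long hi = static_cast<unsigned long long>(m >> 64);
    return __builtin_popcountll(lo) + __builtin_popcountll(hi);
}
```

For values below 64000, as in the reported test case, both functions agree, since every set bit lies in the low 32 bits.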
[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 Richard Biener changed: What|Removed |Added Keywords||missed-optimization CC||rguenth at gcc dot gnu.org Component|c++ |libstdc++ --- Comment #1 from Richard Biener --- This is because it still needs to generate the std::string objects at the caller site (outside of the if (print)). This involves quite some code to get rid of, and even at -O3 we do not inline basic_string::basic_string it seems (ISTR that is out-of-line in the library): __asm__ __volatile__("mfence" : : : "memory"); _6 = MEM[(const int *) + 4B]; if (_6 > 0) goto ; [41.48%] else goto ; [58.52%] [local count: 445388109]: std::basic_string::basic_string (, "<", ); _7 = MEM[(char * *)]; _8 = _7 + 18446744073709551592; if (_8 != &_S_empty_rep_storage) goto ; [10.00%] else goto ; [90.00%] [local count: 434030711]: goto ; [100.00%] [local count: 44538811]: if (__gthrw___pthread_key_create != 0B) goto ; [53.47%] else goto ; [46.53%] [local count: 23814902]: _9 = [(struct _Rep *)_7 + -24B].D.23940._M_refcount; _10 = __atomic_fetch_add_4 (_9, 4294967295, 4); _11 = (int) _10; goto ; [100.00%] [local count: 20723909]: __result_12 = MEM[(_Atomic_word *)_7 + -8B]; _13 = __result_12 + -1; MEM[(_Atomic_word *)_7 + -8B] = _13; [local count: 44538811]: # _14 = PHI <_11(6), __result_12(7)> if (_14 <= 0) goto ; [25.50%] else goto ; [74.50%] [local count: 11357397]: std::basic_string::_Rep::_M_destroy (_8, ); [local count: 445388108]: D.39206 ={v} {CLOBBER}; D.39204 ={v} {CLOBBER}; D.39205 ={v} {CLOBBER}; [local count: 1073741825]: __asm__ __volatile__("mfence" : : : "memory"); data ={v} {CLOBBER};
[Bug tree-optimization/88970] ICE: verify_ssa failed (error: definition in block 2 follows the use)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88970 Richard Biener changed: What|Removed |Added CC||jason at gcc dot gnu.org Version|unknown |9.0 --- Comment #2 from Richard Biener --- Looks like a missing/incomplete DECL_EXPR. ;; Function void d() (null) ;; enabled by -tree-original { typedef int e[0:(sizetype) SAVE_EXPR ]; ^^^ shouldn't this have (ssizetype) b (1) + -1)? int f[0:(sizetype) SAVE_EXPR ]; int c; typedef struct __lambda0 __lambda0; ssizetype D.2306; < (1) + -1) >; <];>>; int c; <::operator() (_EXPR ) >; }
[Bug tree-optimization/88964] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1561
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88964 --- Comment #6 from Jakub Jelinek --- IMHO it shouldn't in the spot I'm changing; there, the + 0.0 really should be folded (and if not, we should tweak create_iv not to do any addition if real_zerop). Though of course for other floating point IVs where the step is non-zero it could make a difference.
[Bug c++/88969] [9 Regression] ICE in build_op_delete_call, at cp/call.c:6509
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88969 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Target Milestone|--- |9.0 Summary|ICE in |[9 Regression] ICE in |build_op_delete_call, at|build_op_delete_call, at |cp/call.c:6509 |cp/call.c:6509
[Bug c/88968] [8/9 Regression] Stack overflow in gimplify_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88968 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug target/88965] powerpc64le vector builtin hits ICE in verify_gimple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88965 --- Comment #4 from Richard Biener --- LGTM
[Bug tree-optimization/88964] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1561
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88964 --- Comment #5 from Richard Biener --- Hmm, I wonder if handling FP inductions during interchange causes correctness issues as well (FP rounding, etc.). Otherwise the patch looks obvious.
[Bug tree-optimization/88862] [9 Regression] ICE in extract_affine, at graphite-sese-to-poly.c:313
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88862 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from Richard Biener --- Fixed.
[Bug tree-optimization/88862] [9 Regression] ICE in extract_affine, at graphite-sese-to-poly.c:313
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88862 --- Comment #4 from Richard Biener --- Author: rguenth Date: Tue Jan 22 11:28:56 2019 New Revision: 268147 URL: https://gcc.gnu.org/viewcvs?rev=268147&root=gcc&view=rev Log: 2019-01-22 Richard Biener PR tree-optimization/88862 * graphite-scop-detection.c (scop_detection::graphite_can_represent_scev): Reject ADDR_EXPR. Modified: trunk/gcc/ChangeLog trunk/gcc/graphite-scop-detection.c
[Bug c/88955] transparent_union for vector types not accepted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88955 Richard Biener changed: What|Removed |Added Keywords||rejects-valid Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-22 CC||hjl.tools at gmail dot com, ||jsm28 at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- Hmm. I guess the "issue" is that the union has TImode rather than V2DImode. stor-layout doesn't look at TYPE_TRANSPARENT_AGGR at all though. Relevant is /* If we only have one real field; use its mode if that mode's size matches the type's size. This generally only applies to RECORD_TYPE. For UNION_TYPE, if the widest field is MODE_INT then use that mode. If the widest field is MODE_PARTIAL_INT, and the union will be passed by reference, then use that mode. */ poly_uint64 type_size; if ((TREE_CODE (type) == RECORD_TYPE || (TREE_CODE (type) == UNION_TYPE && (GET_MODE_CLASS (mode) == MODE_INT || (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT && targetm.calls.pass_by_reference (pack_cumulative_args (0), mode, type, 0) && mode != VOIDmode && poly_int_tree_p (TYPE_SIZE (type), _size) && known_eq (GET_MODE_BITSIZE (mode), type_size)) ; else mode = mode_for_size_tree (TYPE_SIZE (type), MODE_INT, 1).else_blk (); where we reject vector modes. The C++ diagnostic is a bit more clear: > g++ t.c -S t.c:5:1: error: type transparent ‘union’ cannot be made transparent because the type of the first field has a different ABI from the class overall { ^ which hints at the implementation of the argument passing being the culprit for the restriction (not sure why the ABI of the class overall should matter given the docs of transparent_union say the ABI is specified by the first field...)
[Bug tree-optimization/88973] New: New -Wrestrict warning since r268048
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88973 Bug ID: 88973 Summary: New -Wrestrict warning since r268048 Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: marxin at gcc dot gnu.org CC: msebor at gcc dot gnu.org Target Milestone: --- Created attachment 45492 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45492&action=edit test-case The test-case comes from the autogen package: $ gcc autogen.i -c -O2 -Werror=restrict In function ‘strcpy’, inlined from ‘canonicalize_pathname’ at autogen.i:10536:17, inlined from ‘option_pathfind.constprop’ at autogen.i:10420:32: autogen.i:4050:10: error: ‘__builtin_strcpy’ accessing 1 byte at offsets [0, 9223372036854775807] and [0, 9223372036854775807] may overlap 1 byte at offset 0 [-Werror=restrict] 4050 | return __builtin___strcpy_chk (__dest, __src, __builtin_object_size (__dest, 2 > 1)); | ^ cc1: some warnings being treated as errors Martin, can you please verify that the warning is correct?
[Bug tree-optimization/88713] Vectorized code slow vs. flang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713 --- Comment #27 from Chris Elrod --- g++ -mrecip=all -O3 -fno-signed-zeros -fassociative-math -freciprocal-math -fno-math-errno -ffinite-math-only -fno-trapping-math -fdump-tree-optimized -S -march=native -shared -fPIC -mprefer-vector-width=512 -fno-semantic-interposition -o gppvectorization_test.s vectorization_test.cpp is not enough to get vrsqrt. I need -funsafe-math-optimizations for the instruction to appear in the asm.
[Bug target/88954] __attribute__((noplt)) doesn't work with function pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88954 --- Comment #5 from Richard Biener --- For indirect calls the attributes on the function type pointed to are relevant. Unioning attributes from the actually called function (if the compiler can figure that out) can be appropriate depending on the actual attribute.
[Bug middle-end/88950] stack_protect_prologue can be reordered by sched1 around memory accesses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88950 Matthew Malcomson changed: What|Removed |Added Known to fail||5.4.0 --- Comment #5 from Matthew Malcomson --- This problem has been around for a long time -- I have seen the same fundamental problem on gcc 5.4 (when looking for a version to put in the "known to work" field). With "gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609" on the same testcase, the stack_protect_test pattern gets reordered to before the second memory access (the "buf[b] = c" line), and again the stack protection does not guard this memory access. (insn:TI 8 126 16 (parallel [ (set (mem/v/f/c:DI (plus:DI (reg/f:DI 29 x29) (const_int 88 [0x58])) [1 D.2834+0 S8 A64]) (unspec:DI [ (mem/v/f/c:DI (reg/f:DI 3 x3 [100]) [1 __stack_chk_guard+0 S8 A64]) ] UNSPEC_SP_SET)) (set (reg:DI 5 x5 [126]) (const_int 0 [0])) ]) stack-reorder.c:1 864 {stack_protect_set_di} (expr_list:REG_UNUSED (reg:DI 5 x5 [126]) (nil))) (insn:TI 16 8 71 (set (mem/j:QI (plus:DI (reg:DI 0 x0 [105]) (const_int 4016 [0xfb0])) [0 buf S1 A8]) (reg:QI 4 x4 [106])) stack-reorder.c:3 45 {*movqi_aarch64} (expr_list:REG_DEAD (reg:QI 4 x4 [106]) (expr_list:REG_DEAD (reg:DI 0 x0 [105]) (nil (insn 71 16 22 (parallel [ (set (reg:DI 3 x3 [125]) (unspec:DI [ (mem/v/f/c:DI (plus:DI (reg/f:DI 29 x29) (const_int 88 [0x58])) [1 D.2834+0 S8 A64]) (mem/v/f/c:DI (reg/f:DI 3 x3 [100]) [1 __stack_chk_guard+0 S8 A64]) ] UNSPEC_SP_TEST)) (clobber (reg:DI 0 x0 [127])) ]) stack-reorder.c:14 866 {stack_protect_test_di} (expr_list:REG_UNUSED (reg:DI 0 x0 [127]) (nil))) (insn:TI 22 71 140 (set (mem/j:QI (plus:DI (reg:DI 1 x1 [110]) (const_int 4016 [0xfb0])) [0 buf S1 A8]) (reg:QI 2 x2 [ c ])) stack-reorder.c:4 45 {*movqi_aarch64} (expr_list:REG_DEAD (reg:QI 2 x2 [ c ]) (expr_list:REG_DEAD (reg:DI 1 x1 [110]) (nil
[Bug tree-optimization/88713] Vectorized code slow vs. flang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713 --- Comment #26 from Chris Elrod ---
> You can try enabling -mrecip to see RSQRT in .optimized - there's
> probably late 1/sqrt optimization on RTL.

No luck. The full commands I used:

gfortran -Ofast -mrecip -S -fdump-tree-optimized -march=native -shared -fPIC -mprefer-vector-width=512 -fno-semantic-interposition -o gfortvectorizationdump.s vectorization_test.f90

g++ -mrecip -Ofast -fdump-tree-optimized -S -march=native -shared -fPIC -mprefer-vector-width=512 -fno-semantic-interposition -o gppvectorization_test.s vectorization_test.cpp

g++'s output was similar:

  vect_U33_60.31_372 = SQRT (vect_S33_59.30_371);
  vect_Ui33_61.32_374 = { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 } / vect_U33_60.31_372;
  vect_U13_62.33_375 = vect_S13_47.24_359 * vect_Ui33_61.32_374;
  vect_U23_63.34_376 = vect_S23_53.27_365 * vect_Ui33_61.32_374;

and it has the same assembly as gfortran for the rsqrt:

        vcmpps  $4, %zmm0, %zmm5, %k1
        vrsqrt14ps      %zmm0, %zmm1{%k1}{z}
        vmulps  %zmm0, %zmm1, %zmm2
        vmulps  %zmm1, %zmm2, %zmm0
        vmulps  %zmm6, %zmm2, %zmm2
        vaddps  %zmm7, %zmm0, %zmm0
        vmulps  %zmm2, %zmm0, %zmm0
        vrcp14ps        %zmm0, %zmm10
        vmulps  %zmm0, %zmm10, %zmm0
        vmulps  %zmm0, %zmm10, %zmm0
        vaddps  %zmm10, %zmm10, %zmm10
        vsubps  %zmm0, %zmm10, %zmm10
[Bug tree-optimization/88964] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1561
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88964 --- Comment #4 from Jakub Jelinek --- Created attachment 45491 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45491&action=edit gcc9-pr88964.patch Untested fix.
[Bug rtl-optimization/88953] Unrecognizable insn on architecture zEC12 with boost::bimap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88953 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #6 from Jakub Jelinek --- Fixed then on all active branches.
[Bug target/88963] gcc generates terrible code for vectors of 64+ length which are not natively supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88963 --- Comment #9 from Devin Hussey --- (In reply to Andrew Pinski from comment #6) > Try using 128 (or 256) and you might see that aarch64 falls down similarly. yup. Oof. test: sub sp, sp, #560 stp x29, x30, [sp] mov x29, sp stp x19, x20, [sp, 16] mov x19, 128 mov x20, x0 add x0, sp, 176 str x21, [sp, 32] mov x21, x2 mov x2, x19 bl memcpy mov x2, x19 mov x1, x21 add x0, sp, 304 bl memcpy ldr q7, [sp, 176] mov x2, x19 ldr q6, [sp, 192] add x1, sp, 48 ldr q5, [sp, 208] mov x0, x20 ldr q4, [sp, 224] ldr q3, [sp, 240] ldr q2, [sp, 256] ldr q1, [sp, 272] ldr q0, [sp, 288] ldr q23, [sp, 304] ldr q22, [sp, 320] ldr q21, [sp, 336] ldr q20, [sp, 352] ldr q19, [sp, 368] ldr q18, [sp, 384] ldr q17, [sp, 400] ldr q16, [sp, 416] add v7.4s, v7.4s, v23.4s add v6.4s, v6.4s, v22.4s add v5.4s, v5.4s, v21.4s add v4.4s, v4.4s, v20.4s add v3.4s, v3.4s, v19.4s str q7, [sp, 48] add v2.4s, v2.4s, v18.4s str q6, [sp, 64] add v1.4s, v1.4s, v17.4s str q5, [sp, 80] add v0.4s, v0.4s, v16.4s str q4, [sp, 96] str q3, [sp, 112] str q2, [sp, 128] str q1, [sp, 144] str q0, [sp, 160] bl memcpy ldp x29, x30, [sp] ldp x19, x20, [sp, 16] ldr x21, [sp, 32] add sp, sp, 560 ret
[Bug tree-optimization/88713] Vectorized code slow vs. flang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713 --- Comment #25 from rguenther at suse dot de --- On Tue, 22 Jan 2019, elrodc at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713 > > --- Comment #24 from Chris Elrod --- > The dump looks like this: > > vect__67.78_217 = SQRT (vect__213.77_225); > vect_ui33_68.79_248 = { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, > 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 > } / vect__67.78_217; > vect__71.80_249 = vect__246.59_65 * vect_ui33_68.79_248; > vect_u13_73.81_250 = vect__187.71_14 * vect_ui33_68.79_248; > vect_u23_75.82_251 = vect__200.74_5 * vect_ui33_68.79_248; > > so the vrsqrt optimization happens later. g++ shows the same problems with > weird code generation. However this: > > /* sqrt(a) = -0.5 * a * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0) > rsqrt(a) = -0.5 * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0) */ > > does not match this: > > vrsqrt14ps %zmm1, %zmm2 # comparison and mask removed > vmulps %zmm1, %zmm2, %zmm0 > vmulps %zmm2, %zmm0, %zmm1 > vmulps %zmm6, %zmm0, %zmm0 > vaddps %zmm7, %zmm1, %zmm1 > vmulps %zmm0, %zmm1, %zmm1 > vrcp14ps%zmm1, %zmm0 > vmulps %zmm1, %zmm0, %zmm1 > vmulps %zmm1, %zmm0, %zmm1 > vaddps %zmm0, %zmm0, %zmm0 > vsubps %zmm1, %zmm0, %zmm0 > > Recommendations on the next place to look for what's going on? You can try enabling -mrecip to see RSQRT in .optimized - there's probably late 1/sqrt optimization on RTL.
[Bug tree-optimization/88972] New: popcnt of limited 128-bit number with unnecessary zeroing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88972 Bug ID: 88972 Summary: popcnt of limited 128-bit number with unnecessary zeroing Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: drepper.fsp+rhbz at gmail dot com Target Milestone: ---

Compile the following code on x86-64 with -Ofast -march=haswell:

int f(__uint128_t m)
{
  if (m < 64000)
    return __builtin_popcount(m);
  return -1;
}

The generated code with the trunk gcc looks like this:

   0:   b8 ff f9 00 00          mov    $0xf9ff,%eax
   5:   48 39 f8                cmp    %rdi,%rax
   8:   b8 00 00 00 00          mov    $0x0,%eax
   d:   48 19 f0                sbb    %rsi,%rax
  10:   72 0e                   jb     20
  12:   31 c0                   xor    %eax,%eax
  14:   f3 0f b8 c7             popcnt %edi,%eax
  18:   c3                      retq
  19:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  20:   b8 ff ff ff ff          mov    $0xffffffff,%eax
  25:   c3                      retq

The instruction at offset 12 is unnecessary. I guess this is a left-over from the popcnt of the upper half which is recognized to be unnecessary and left out. There is no addition anymore but somehow the register clearing survived.
[Bug middle-end/88950] stack_protect_prologue can be reordered by sched1 around memory accesses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88950 ktkachov at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|2019-01-21 00:00:00 |2019-01-22 CC||ktkachov at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #4 from ktkachov at gcc dot gnu.org --- Confirmed on aarch64 then.
[Bug tree-optimization/88713] Vectorized code slow vs. flang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713 --- Comment #24 from Chris Elrod --- The dump looks like this:

  vect__67.78_217 = SQRT (vect__213.77_225);
  vect_ui33_68.79_248 = { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 } / vect__67.78_217;
  vect__71.80_249 = vect__246.59_65 * vect_ui33_68.79_248;
  vect_u13_73.81_250 = vect__187.71_14 * vect_ui33_68.79_248;
  vect_u23_75.82_251 = vect__200.74_5 * vect_ui33_68.79_248;

so the vrsqrt optimization happens later. g++ shows the same problems with weird code generation. However this:

  /* sqrt(a)  = -0.5 * a * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0)
     rsqrt(a) = -0.5     * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0) */

does not match this:

        vrsqrt14ps      %zmm1, %zmm2 # comparison and mask removed
        vmulps  %zmm1, %zmm2, %zmm0
        vmulps  %zmm2, %zmm0, %zmm1
        vmulps  %zmm6, %zmm0, %zmm0
        vaddps  %zmm7, %zmm1, %zmm1
        vmulps  %zmm0, %zmm1, %zmm1
        vrcp14ps        %zmm1, %zmm0
        vmulps  %zmm1, %zmm0, %zmm1
        vmulps  %zmm1, %zmm0, %zmm1
        vaddps  %zmm0, %zmm0, %zmm0
        vsubps  %zmm1, %zmm0, %zmm0

Recommendations on the next place to look for what's going on?
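For reference, the formula in the quoted comment is one Newton-Raphson refinement step applied to a hardware estimate of 1/sqrt(a). The scalar sketch below only illustrates that arithmetic; the function names are hypothetical, the estimate r stands in for the result of rsqrtss/vrsqrt14ps, and nothing here is taken from the GCC sources beyond the two formulas.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for 1/sqrt(a), given a rough estimate r:
//   rsqrt(a) = -0.5 * r * (a * r * r - 3.0)
float refine_rsqrt(float a, float r) {
    return -0.5f * r * (a * r * r - 3.0f);
}

// The sqrt form only differs by an extra factor of a:
//   sqrt(a) = -0.5 * a * r * (a * r * r - 3.0)
float refine_sqrt(float a, float r) {
    return a * refine_rsqrt(a, r);
}
```

With a = 2 and a starting estimate of 0.7, a single step already lands much closer to the exact value than the estimate itself, which is the property the expansion relies on.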
[Bug rtl-optimization/88948] [9 Regression] ICE in elimination_costs_in_insn, at reload1.c:3640 since r264148
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88948 --- Comment #2 from Uroš Bizjak --- The problem is with can_assign_to_reg_without_clobbers_p in gcse.c, where we have: /* If the test insn is valid and doesn't need clobbers, and the target also has no objections, we're good. */ if (icode >= 0 && (num_clobbers == 0 || !added_clobbers_hard_reg_p (icode)) && ! (targetm.cannot_copy_insn_p && targetm.cannot_copy_insn_p (test_insn))) can_assign = true; The test instruction is created as: (insn 26 0 0 (set (reg:SI 152) (fix:SI (reg:DF 89))) -1 (nil)) which is (correctly) recognized as (define_insn "fix_trunc_i387_fisttp" [(set (match_operand:SWI248x 0 "nonimmediate_operand" "=m") (fix:SWI248x (match_operand 1 "register_operand" "f"))) (clobber (match_scratch:XF 2 "="))] However, recog also reports that 1 clobber needs to be added. The instruction is recognized nevertheless due to "|| !added_clobbers_hard_reg_p (icode)" bypass. The recognized insn doesn't clobber hard reg, but it also needs a clobber of a scratch reg to be recognized.
[Bug rtl-optimization/88953] Unrecognizable insn on architecture zEC12 with boost::bimap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88953 --- Comment #5 from Jan Kossmann --- You are right, I verified with: gcc version 9.0.0 20190122 (experimental) (GCC) COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-o' 'test.cpp.o' '-shared-libgcc' '-march=z13' '-mno-htm' '-mzarch' '-m64' gcc/bin/../libexec/gcc/s390x-ibm-linux-gnu/9.0.0/cc1plus -E -quiet -v -imultiarch s390x-linux-gnu -iprefix gcc/bin/../lib/gcc/s390x-ibm-linux-gnu/9.0.0/ -D_GNU_SOURCE test.cpp -march=z13 -mno-htm -mzarch -m64 -O3 -fpch-preprocess -o test.ii and it worked out fine. Sorry for the trouble, thanks for your help!
[Bug c++/88971] New: Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 Bug ID: 88971 Summary: Branch optimization inconsistency (missed optimization) Product: gcc Version: 8.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: maratrus at mail dot ru Target Milestone: --- Created attachment 45490 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45490&action=edit A code that demonstrates different patterns in optimization technique In the code attached I expect the compiler not to generate any code between two `mfence` instructions in the method `CheckAndPrint()`. Indeed, it does a good job if I call the `PrintGood()` method, and no code is generated. But if I out-comment `PrintBad()` or even a simple return, the compiler generates code for the if-expression `if (t.j > 0)`. In all three cases there seems to be no reason to generate any code. The code attached is compiled as: `g++ -std=c++11 -Ofast opt_template.cc -o opt_template` I must be missing something, but is there a good reason why the compiler managed to optimize the code in one case but not in the other two?
[Bug tree-optimization/88964] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1561
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88964 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED CC||jakub at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek ---
--- gcc/gimple-loop-interchange.cc.jj 2019-01-01 12:37:17.416970701 +0100
+++ gcc/gimple-loop-interchange.cc 2019-01-22 11:34:42.303796570 +0100
@@ -692,7 +692,7 @@ loop_cand::analyze_induction_var (tree v
       iv->var = var;
       iv->init_val = init;
       iv->init_expr = chrec;
-      iv->step = build_int_cst (TREE_TYPE (chrec), 0);
+      iv->step = build_zero_cst (TREE_TYPE (chrec));
       m_inductions.safe_push (iv);
       return true;
     }

fixes this. SCEV is able to deal with non-integral/pointer IVs like SCALAR_FLOAT_TYPE_P in this case and create_iv as well, just build_int_cst must not be used in that case.
[Bug rtl-optimization/88953] Unrecognizable insn on architecture zEC12 with boost::bimap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88953 --- Comment #4 from Andreas Krebbel --- Looks like a problem which was fixed with r265158: S/390: Fix problem with vec_init expander gcc/ChangeLog: 2018-10-15 Andreas Krebbel * config/s390/s390.c (s390_expand_vec_init): Force vector element into reg if it isn't a general operand. gcc/testsuite/ChangeLog: 2018-10-15 Andreas Krebbel * g++.dg/vec-init-1.C: New test. I've backported the patch to GCC 7 and 8 branch on 2018-10-19. Canonical is aware of the problem and will pick the patch up for their next GCC updates. Could you please check whether this fixes your problem?
[Bug fortran/37398] Statement functions mask missing PURE procedures.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37398 --- Comment #4 from Dominique d'Humieres --- > This correctly gives the expected error messages since at least gfortran 5.4. > Closing as FIXED? FORALL(i=1:4) a(i) = st3 (i) is still not caught.
[Bug target/88906] wrong code with -march=k6 -minline-all-stringops -minline-stringops-dynamically -mmemcpy-strategy=libcall:-1:align and vector argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88906 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #9 from Jakub Jelinek --- Fixed on the trunk so far.
[Bug rtl-optimization/88904] [9 Regression] Basic block incorrectly skipped in jump threading.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88904 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Jakub Jelinek --- Fixed.
[Bug middle-end/88897] Bogus maybe-uninitialized warning on class field
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88897 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-22 Ever confirmed|0 |1 --- Comment #6 from Richard Biener --- So this boils down to a missed optimization (as many cases do...). The uninit warning sees

  [local count: 1073741825]:
  _3 = bar ();
  future_state::future_state (&_local_state);
  MEM[(struct &)&_local_state] ={v} {CLOBBER};
  MEM[(struct optional *)&_local_state]._M_engaged = 0;
  MEM[(struct optional *)_3]._M_engaged = 0;
  _7 = MEM[(struct optional &)&_local_state]._M_engaged;
  if (_7 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870912]:
  _6 = MEM[(struct temporary_buffer &)&_local_state]._buffer;
  ...

and warns about the load _6 = ... As you can see the condition isn't elided and somehow we didn't manage to CSE the load of _M_engaged here, possibly due to the apparent aliasing of the store via _3. points-to analysis explicitly says it might alias _local_state because _local_state escapes to future_state::future_state and PTA is not flow-sensitive:

  [local count: 1073741825]:
  # PT = nonlocal escaped null
  # USE = nonlocal null { D.2493 } (escaped)
  # CLB = nonlocal null { D.2493 } (escaped)
  _3 = bar ();
  # USE = nonlocal null { D.2493 } (escaped)
  # CLB = nonlocal null { D.2493 } (escaped)
  future_state::future_state (&_local_stateD.2493);
  MEM[(struct &)&_local_stateD.2493] ={v} {CLOBBER};
  MEM[(struct optionalD.2409 *)&_local_stateD.2493]._M_engagedD.2426 = 0;
  MEM[(struct optionalD.2409 *)_3]._M_engagedD.2426 = 0;
[Bug preprocessor/88966] Indirect stringification of "linux" produces "1"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88966 Jonathan Wakely changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |INVALID --- Comment #5 from Jonathan Wakely --- This is not a bug, "linux" is a predefined macro and the preprocessor is doing exactly what it's supposed to. See https://gcc.gnu.org/onlinedocs/cpp/System-specific-Predefined-Macros.html
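The behaviour described above follows from how the # operator interacts with macro expansion. Here is a minimal sketch, assuming nothing about the reporter's actual code: my_os is a hypothetical stand-in for a predefined macro such as linux (which GNU dialects define to 1 on Linux targets), and STR/XSTR are the conventional names for the two helper macros.

```cpp
#include <cassert>
#include <cstring>

// One-level stringification: the argument is NOT macro-expanded before '#'.
#define STR(x) #x
// Two-level ("indirect") stringification: the argument is expanded first,
// then handed to STR.
#define XSTR(x) STR(x)

// Hypothetical stand-in for a predefined macro like `linux`.
#define my_os 1

const char *direct_name()   { return STR(my_os); }   // the macro's name
const char *indirect_name() { return XSTR(my_os); }  // the macro's value
```

Here direct_name() yields the text "my_os" while indirect_name() yields "1", which mirrors the report: indirect stringification of a name that happens to be a predefined macro produces its expansion.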
[Bug target/88963] gcc generates terrible code for vectors of 64+ length which are not natively supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88963 --- Comment #8 from Richard Biener --- You can try the attached patch, it "fixes" the issue on the GIMPLE side but apparently the BIT_FIELD_REF stores go a weird path during RTL expansion and so we end up spilling again.
[Bug target/88963] gcc generates terrible code for vectors of 64+ length which are not natively supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88963 --- Comment #7 from Marc Glisse --- See PR 55266 (and several others).
[Bug tree-optimization/88862] [9 Regression] ICE in extract_affine, at graphite-sese-to-poly.c:313
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88862 --- Comment #2 from Richard Biener --- Huh. We get here from originally (integer(kind=4)) The stmt we analyze is if (_4 != _316) I have a simple patch.
[Bug tree-optimization/88044] [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88044 Jakub Jelinek changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #16 from Jakub Jelinek --- Fixed.