[Bug tree-optimization/100171] autovectorizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100171

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener ---
Well, the issue is that we end up with (for the simplest case):

  [local count: 357878152]:
  _15 = MEM [(const value_type &)arg_3(D)][0];
  _16 = MEM [(value_type &)out_2(D)][0];
  _17 = _15 + _16;
  MEM [(value_type &)out_2(D)][0] = _17;
  _22 = MEM [(const value_type &)arg_3(D)][1];
  _23 = MEM [(value_type &)out_2(D)][1];
  _24 = _22 + _23;
  MEM [(value_type &)out_2(D)][1] = _24;
  return;

and the first store into out[0] can end up writing to arg[1].  I don't see
what we can easily do here.  Path based disambiguation could maybe argue
that partial overlaps of value_type are not allowed.
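To make the overlap concrete, here is a minimal C sketch. It is not the reporter's test case: the function names, the driver and the buffer layout are made up for illustration, and plain int stands in for value_type. When out starts one element after arg, the scalar statement order and a load-everything-first order (roughly what a 2-lane vector add would do) give different results, which is why the compiler cannot vectorize without first ruling out the overlap.

```c
/* Scalar order, as in the GIMPLE above: the store to out[0] happens
   before the load of arg[1].  */
void add2_scalar(int *out, const int *arg)
{
    out[0] += arg[0];
    out[1] += arg[1];
}

/* Roughly what a 2-lane vector add would do: all loads from arg and
   out happen before either store.  */
void add2_vector_like(int *out, const int *arg)
{
    int a0 = arg[0], a1 = arg[1];
    int o0 = out[0], o1 = out[1];
    out[0] = o0 + a0;
    out[1] = o1 + a1;
}

/* Hypothetical driver: out begins one element after arg, so
   &out[0] == &arg[1].  Returns the final value of the last element.  */
int run_overlapping(void (*fn)(int *, const int *))
{
    int buf[3] = {1, 2, 3};
    fn(&buf[1], &buf[0]);
    return buf[2];
}
```

With this layout the scalar version reads the freshly stored out[0] back through arg[1], while the vector-like version uses the old value, so the two disagree on the last element.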
[Bug tree-optimization/100173] telecom/viterb00data_1 has 16.92% regression compared O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2 on CLX/ICX, 9% regression on znver3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener ---
Note that store commoning of code sinking will sink the last store anyway:

@@ -546,7 +233,6 @@
   _27 = _26 << 1;
   _28 = (short int) _27;
   _29 = _28 | 1;
-  MEM[(struct StatePathMetricData *)pOut_90 + 4B].m_esState = _29;
   goto ; [100.00%]

   [local count: 505302904]:
@@ -556,21 +242,19 @@
   _32 = _31 << 1;
   _33 = (short int) _32;
   _34 = _33 | 1;
-  MEM[(struct StatePathMetricData *)pOut_90 + 4B].m_esState = _34;

   [local count: 1010605809]:
+  # _94 = PHI <_29(7), _34(8)>
+  MEM[(struct StatePathMetricData *)pOut_90 + 4B].m_esState = _94;

but yes, cselim will also sink the first store, moving it across the scalar
compute in the block.

I might note that ideally we'd sink all the compute as well and end up
with just a conditional load of either pIn1->m_esState or pIn2_89->m_esState.
That might then allow scheduling to recover the original performance.

You can try that as a source transform, like

  e_s16 tem1, tem2;
  if (esMetric1 >=esMetric2)
    {
      tem1 = esMetric1;
      tem2 = pIn1->m_esState;
    }
  else
    {
      tem1 = esMetric2;
      tem2 = pIn2->m_esState;
    }
  pOut->m_esPathMetric =tem1;
  pOut->m_esState = (tem2 << 1) | 1;
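The suggested source transform can be written out as a small compilable C sketch. The typedef of e_s16 as short and the two-field struct layout are assumptions (only the field names come from the test case), and the function names store_branchy and store_sunk are made up. The sketch only shows that both forms store the same values; it says nothing about the resulting performance.

```c
typedef short e_s16;   /* assumption: 16-bit signed, as the name suggests */

/* Assumed layout; only the two field names come from the test case.  */
typedef struct {
    e_s16 m_esPathMetric;
    e_s16 m_esState;
} StatePathMetricData;

/* Second half of the loop body as written: stores in both arms.  */
void store_branchy(StatePathMetricData *pOut,
                   const StatePathMetricData *pIn1,
                   const StatePathMetricData *pIn2,
                   e_s16 esMetric1, e_s16 esMetric2)
{
    if (esMetric1 >= esMetric2) {
        pOut->m_esPathMetric = esMetric1;
        pOut->m_esState = (pIn1->m_esState << 1) | 1;
    } else {
        pOut->m_esPathMetric = esMetric2;
        pOut->m_esState = (pIn2->m_esState << 1) | 1;
    }
}

/* Transformed form: the arms only select values, and the shift, the
   OR and both stores are done once after the branch.  */
void store_sunk(StatePathMetricData *pOut,
                const StatePathMetricData *pIn1,
                const StatePathMetricData *pIn2,
                e_s16 esMetric1, e_s16 esMetric2)
{
    e_s16 tem1, tem2;
    if (esMetric1 >= esMetric2) {
        tem1 = esMetric1;
        tem2 = pIn1->m_esState;
    } else {
        tem1 = esMetric2;
        tem2 = pIn2->m_esState;
    }
    pOut->m_esPathMetric = tem1;
    pOut->m_esState = (tem2 << 1) | 1;
}
```

The sketch assumes non-negative state values that stay within the range of short after the shift, since left-shifting a negative value is undefined in C.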
[Bug translation/100174] New: Binary floating-point conversion under source-gcc/gcc/real.[c\h] test on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100174

            Bug ID: 100174
           Summary: Binary floating-point conversion under
                    source-gcc/gcc/real.[c\h] test on x86-64
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: translation
          Assignee: unassigned at gcc dot gnu.org
          Reporter: 608410104 at alum dot ccu.edu.tw
  Target Milestone: ---

For example: float a = 0.

I don't know why gcc uses the clear_significand_below function (real.c) to
clear the remaining bits in sig[SIGSZ-1].  Why not just truncate it to the
bits that are needed?

> Before clear sig[SIGSZ-1] = 01010101 01001100 10011000 0101 0110
> 0110 10010100 0100011
> After clear sig[SIGSZ-1] = 01010101 01001100 10011000
> 000

Sorry if I am not clear; I will elaborate if this is still unclear.
[Bug libstdc++/100164] [11/12 Regression] semaphore_impl not declared on AIX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164 Richard Biener changed: What|Removed |Added Summary|[11 Regression] |[11/12 Regression] |semaphore_impl not declared |semaphore_impl not declared |on AIX |on AIX Target Milestone|--- |11.0
[Bug testsuite/100159] Typos in testsuite files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100159 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org, ||msebor at gcc dot gnu.org Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2021-04-21 --- Comment #1 from Martin Liška --- @Martin: Can you please take a look?
[Bug fortran/100156] ICE in gfc_trans_array_cobounds, at fortran/trans-array.c:6257
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100156 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2021-04-21 Status|UNCONFIRMED |NEW --- Comment #1 from Martin Liška --- Fails also with GCC 4.8.0.
[Bug c++/100157] Support `__type_pack_element` like Clang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100157 Richard Biener changed: What|Removed |Added Version|unknown |12.0 Severity|normal |enhancement
[Bug fortran/100155] [9/10/11 Regression] ICE in gfc_conv_intrinsic_size, at fortran/trans-intrinsic.c:805
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100155 Richard Biener changed: What|Removed |Added Priority|P3 |P4 Target Milestone|--- |9.4
[Bug fortran/100155] [9/10/11 Regression] ICE in gfc_conv_intrinsic_size, at fortran/trans-intrinsic.c:805
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100155 Martin Liška changed: What|Removed |Added CC||burnus at gcc dot gnu.org, ||marxin at gcc dot gnu.org Status|UNCONFIRMED |NEW Last reconfirmed||2021-04-21 Ever confirmed|0 |1 --- Comment #2 from Martin Liška --- Started with r9-3522-gd0477233215e37de.
[Bug fortran/100154] [9/10/11 Regression] ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100154 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org, ||tkoenig at gcc dot gnu.org --- Comment #3 from Martin Liška --- Started with r9-3030-g056e6860b3a3f915.
[Bug c/100150] ice in bp_unpack_string
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100150

--- Comment #12 from Martin Liška ---
(In reply to David Binderman from comment #9)
> From a different fedora package build, I have a much simpler test case:
>
> $ /home/dcb/gcc/results/bin/gcc grtter.o gruser.o
>
> Two object modules attached.

Again, please attach pre-processed source files and the corresponding command
lines.
[Bug other/44032] internals documentation is not legally safe to use
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44032

--- Comment #9 from Eric Gallager ---
(In reply to Eric Gallager from comment #8)
> (In reply to Eric Gallager from comment #7)
> > Richard says the FSF doesn't object to combinations of GFDL code from the
> > manual with GPL code from the source and that we can put a statement to this
> > effect in the internals manual.
>
> So, now that RMS is out at the FSF... does what he has to say on this issue
> even matter any longer, or do we have to ask someone else at the FSF now?

Er, let me amend this: he's back in at the FSF, but out of the GCC Steering
Committee... so I guess the question still remains, though: is his opinion the
one that matters, or do we have to ask someone else?
[Bug c/67224] UTF-8 support for identifier names in GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224 Eric Gallager changed: What|Removed |Added CC||branning at gmail dot com, ||development at jordi dot vilar.cat ||, dwolf at dannad dot de, ||egallager at gcc dot gnu.org, ||spoa at eircom dot net --- Comment #37 from Eric Gallager --- Redoing a few CCs that got removed without being marked as removed in the bug history; presumably from the server migration
[Bug tree-optimization/100173] New: telecom/viterb00data_1 has 16.92% regression compared O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2 on CLX/ICX, 9% regression on znver3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173

            Bug ID: 100173
           Summary: telecom/viterb00data_1 has 16.92% regression compared
                    O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2
                    on CLX/ICX, 9% regression on znver3
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-*-* i?86-*-*

Created attachment 50647
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50647&action=edit
ACS.cpp

cat testcase

void
__attribute__ ((noipa))
ACS(e_s16 *pBranchMetric)
{
  n_int i;
  e_s16 esMetricIn, esMetric1, esMetric2;
  StatePathMetricData *pIn1 = BufPtr[BufSelector];
  StatePathMetricData *pIn2 = pIn1 + (1<<5)/2;
  StatePathMetricData *pOut = BufPtr[1 - BufSelector];
  BufSelector ^= 1;
  for (i = 0; i < (1<<5)/2; i++)
    {
      esMetricIn = *pBranchMetric++;
      esMetric1 = pIn1->m_esPathMetric - esMetricIn;
      esMetric2 = pIn2->m_esPathMetric + esMetricIn;
      if (esMetric1 >= esMetric2)
        {
          pOut->m_esPathMetric = esMetric1;
          pOut->m_esState = (pIn1->m_esState << 1);
        }
      else
        {
          pOut->m_esPathMetric = esMetric2;
          pOut->m_esState = (pIn2->m_esState << 1);
        }
      pOut++;
      esMetric1 = pIn1->m_esPathMetric + esMetricIn;
      esMetric2 = pIn2->m_esPathMetric - esMetricIn;
      if (esMetric1 >=esMetric2)
        {
          pOut->m_esPathMetric =esMetric1;
          pOut->m_esState = (pIn1->m_esState << 1) | 1;
        }
      else
        {
          pOut->m_esPathMetric =esMetric2;
          pOut->m_esState = (pIn2->m_esState << 1) | 1;
        }
      pOut++;
      pIn1++;
      pIn2++;
    }
}

Conditional store replacement comes into play here: it sinks the 2 stores from
IF_BB and ELSE_BB to JOIN_BB since they have the same address.  But the loop
then fails to vectorize with -fvect-cost-model=very-cheap, and the consecutive
stores in JOIN_BB cause worse IPC on both ICX and znver3.

With -fvect-cost-model=cheap, the loop can be vectorized and is 2.6x faster
than O2.

So I think we should either vectorize this loop or not sink conditional stores
when the cost model is very-cheap.

The related code is here:

  /* If either vectorization or if-conversion is disabled then do
     not sink any stores.  */
  if (param_max_stores_to_sink == 0
      || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
      || !flag_tree_loop_if_convert)
    return false;
[Bug tree-optimization/100171] autovectorizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100171

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |11.0
           Severity|normal                      |enhancement
           Keywords|                            |alias
          Component|c++                         |tree-optimization

--- Comment #1 from Andrew Pinski ---
There is an aliasing issue with the += case.
I noticed that clang does not auto-vectorize the exe_self_* cases either.
[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164 Thomas Rodgers changed: What|Removed |Added Attachment #50643|0 |1 is obsolete|| Attachment #50645|0 |1 is obsolete|| --- Comment #10 from Thomas Rodgers --- Created attachment 50646 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50646&action=edit Work around broken macro name This patch works around the borked macro name that David pointed out on the mailing list. I left in the commented out _GLIBCXX_HAVE_POSIX_SEMAPHORE checks and have temporarily replaced them with _GLIBCXX__GLIBCXX_HAVE_POSIX_SEMAPHORE, and forced the posix semaphore test to always run if that macro is defined. I am not sufficiently versed in the arcane ways in which config.h is transformed to c++config.h but the borked macro transformation ideally should be fixed and the commented out checks restored.
[Bug fortran/100149] Seg fault passing to CHARACTER(*), DIMENSION(*), INTENT(IN), OPTIONAL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100149 --- Comment #2 from Scot Breitenfeld --- Thanks for the update; it is good to know that it was fixed in 11.0. I also tried it with GCC master (4.20.2021), and it worked. This is for an open-source library (CGNS), and it is a commonly used API; it is a show-stopper bug. Thanks again.
[Bug fortran/99982] INTERFACE selects wrong module procedure involving C_PTR and C_FUNPTR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99982 --- Comment #1 from Scot Breitenfeld --- I checked with gcc master (4/20/2021), and it still has the same issue.
[Bug c++/100172] ICE with "concept concept" keyword in struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100172 --- Comment #2 from 康桓瑋 --- The struct bit is a red herring. It can be boiled down to just two concept keywords: https://godbolt.org/z/sW7vr3sso concept concept; :1:1: warning: C++20 concept definition syntax is 'concept = ' 1 | concept concept; | ^~~ :1:9: warning: C++20 concept definition syntax is 'concept = ' 1 | concept concept; | ^~~ ' Segmentation fault 0x1d030c9 internal_error(char const*, ...) ???:0 0x1d1eb4b pp_format(pretty_printer*, text_info*) ???:0 0x1d01c4a diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*) ???:0 0x1d049e6 error_at(rich_location*, char const*, ...) ???:0 0x8e261d c_parse_file() ???:0 0xa62752 c_common_parse_file() ???:0 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report.
[Bug c++/100172] ICE with "concept concept" keyword in struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100172

--- Comment #1 from 康桓瑋 ---
And gcc-trunk accepts this nonsense snippet:

https://godbolt.org/z/PbTa55eTx

void f(auto) {
  struct S {
    concept enum E {};
  };
  []() requires S::E {};
}

template void f(int);
[Bug target/100152] Possible 10.3 bad code generation regression from 10.2/9.3 on Mac OS 10.15.7 (Catalina)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

Gabriel Ravier changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabravier at gmail dot com

--- Comment #14 from Gabriel Ravier ---
(In reply to lucier from comment #13)
> (In reply to Iain Sandoe from comment #8)
> > the values of rbp. r10 and esi would be interesting too.
>
> I'm not really familiar with assembler, don't know what register esi is
>
> [...]
>
>rsi = 0x002f

As a side note, if you want to know, esi is the lower 32 bits of rsi, which in
this case would be 0x2f (same as rsi since the upper 32 bits are 0s).
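The rsi/esi relationship can be modelled in C as a truncation of a 64-bit value. This is only an illustration of the arithmetic, not a way to read real registers; the helper names low32 and high32 are made up.

```c
#include <stdint.h>

/* esi names the low 32 bits of rsi, so truncating the 64-bit value
   models what reading esi would give.  */
uint32_t low32(uint64_t rsi)
{
    return (uint32_t)rsi;
}

/* The upper half, for comparison (x86-64 has no register name for it).  */
uint32_t high32(uint64_t rsi)
{
    return (uint32_t)(rsi >> 32);
}
```

For the value in the quoted dump the upper half is zero, which is why esi and rsi read the same there.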
[Bug c++/100172] New: ICE with "concept concept" keyword in struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100172 Bug ID: 100172 Summary: ICE with "concept concept" keyword in struct Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: hewillk at gmail dot com Target Milestone: --- The following only shows a warning, which may be related to fixed PR97536. struct { concept concept; }; :2:3: warning: C++20 concept definition syntax is 'concept = ' 2 | concept concept; | ^~~ :2:11: warning: C++20 concept definition syntax is 'concept = ' 2 | concept concept; | ^~~ ' Segmentation fault 0x1d030c9 internal_error(char const*, ...) ???:0 0x1d1eb4b pp_format(pretty_printer*, text_info*) ???:0 0x1d01c4a diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*) ???:0 0x1d049e6 error_at(rich_location*, char const*, ...) ???:0 0x8e261d c_parse_file() ???:0 0xa62752 c_common_parse_file() ???:0 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report.
[Bug tree-optimization/95409] Failure to xor register before usage of 8-bit part in some bitshifting situations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95409

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-04-21
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement
          Component|target                      |tree-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski ---
Confirmed.
They produce at the tree level:
  _1 = ~x_7(D);
  _4 = x.2_2 < y.3_3;
  _5 = (int) _4;
  _6 = _1 & _5;
  _9 = (bool) _6;

  _3 = x.4_1 < y.5_2;
  _11 = ~x_7(D);
  _5 = (bool) _11;
  _6 = _3 & _5;

---
That is it does not convert:
  _5 = (int) _4;
  _6 = _1 & _5;
  _9 = (bool) _6;

Into:
  _t = (bool) _1;
  _6 = _t & _4;
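The missed fold can be checked in plain C under one assumption that the comment does not spell out: the GIMPLE `(bool)` conversion of an integer is read here as keeping only the low bit, which is different from a C cast to bool (that tests for non-zero). The function names and the use of unsigned operands are made up for this sketch.

```c
/* Form the compiler currently keeps: widen the compare result to int,
   AND it with ~x, then narrow the result.  */
unsigned and_then_narrow(unsigned x, unsigned y)
{
    unsigned t1 = ~x;
    unsigned t5 = (unsigned)(x < y);   /* 0 or 1 */
    unsigned t6 = t1 & t5;             /* already 0 or 1 */
    return t6 & 1u;
}

/* Desired form: narrow ~x to its low bit first, then AND it with the
   compare result.  */
unsigned narrow_then_and(unsigned x, unsigned y)
{
    unsigned t = (~x) & 1u;            /* low-bit model of (bool) _1 */
    return t & (unsigned)(x < y);
}

/* Counts inputs below `limit` on which the two forms disagree.  */
unsigned count_fold_mismatches(unsigned limit)
{
    unsigned bad = 0;
    for (unsigned x = 0; x < limit; x++)
        for (unsigned y = 0; y < limit; y++)
            if (and_then_narrow(x, y) != narrow_then_and(x, y))
                bad++;
    return bad;
}
```

Because the compare result is always 0 or 1, ANDing with it already discards every bit of ~x except the lowest, so narrowing before or after the AND makes no difference.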
[Bug tree-optimization/95410] Failure to optimize compare next to and properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95410 --- Comment #2 from Andrew Pinski --- (In reply to Andrew Pinski from comment #1) > I notice the other two don't produce the same tree level either (but the > same assembly code in the end): Oh the same assembly in the end was on aarch64, on x86_64 they are not. Oh and that problem is PR 95409. I will put my analysis of the differences there.
[Bug tree-optimization/95410] Failure to optimize compare next to and properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95410

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-04-21

--- Comment #1 from Andrew Pinski ---
Confirmed.
We have:
  if (x.0_1 >= y.1_2)
    goto ; [34.00%]
  else
    goto ; [66.00%]

  [local count: 708669601]:
  _10 = ~x_5(D);
  _7 = (bool) _10;

  [local count: 1073741824]:
  # _4 = PHI <0(2), _7(3)>

Which obviously can be converted to:
  _10 = ~x_5(D);
  _7 = (bool) _10;
  _t = x.0_1 >= y.1_2
  _t1 = ~_t
  _4 = _t1 & _7

CUT
I notice the other two don't produce the same tree level either (but the same
assembly code in the end):
  _1 = ~x_7(D);
  _4 = x.2_2 < y.3_3;
  _5 = (int) _4;
  _6 = _1 & _5;
  _9 = (bool) _6;

  _3 = x.4_1 < y.5_2;
  _11 = ~x_7(D);
  _5 = (bool) _11;
  _6 = _3 & _5;

---
That is it does not convert:
  _5 = (int) _4;
  _6 = _1 & _5;
  _9 = (bool) _6;

Into:
  _t = (bool) _1;
  _6 = _t & _4;
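The PHI-to-AND rewrite from the first half of this comment can be sketched in C. As an assumption, `(bool) _10` is modelled as the low bit of ~x, and the `~_t` on a one-bit value is written as `!t`, because `~` on a C int would flip all bits. The function names are made up.

```c
/* Branchy form, mirroring the PHI: 0 when x >= y, otherwise the low
   bit of ~x.  */
unsigned phi_form(unsigned x, unsigned y)
{
    if (x >= y)
        return 0;
    return (~x) & 1u;
}

/* Straight-line form: negate the compare and AND it with the low bit.  */
unsigned and_form(unsigned x, unsigned y)
{
    unsigned t7 = (~x) & 1u;
    unsigned t = (unsigned)(x >= y);
    unsigned t1 = !t;                  /* one-bit NOT */
    return t1 & t7;
}

/* Counts inputs below `limit` on which the two forms disagree.  */
unsigned count_phi_mismatches(unsigned limit)
{
    unsigned bad = 0;
    for (unsigned x = 0; x < limit; x++)
        for (unsigned y = 0; y < limit; y++)
            if (phi_form(x, y) != and_form(x, y))
                bad++;
    return bad;
}
```

The straight-line form always evaluates ~x, which is harmless here because the operation has no side effects and cannot trap.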
[Bug c++/100171] New: autovectorizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100171

            Bug ID: 100171
           Summary: autovectorizer
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: g.peterh...@t-online.de
  Target Milestone: ---

Hello gcc team,
I wrote a small test case to show the problems with the autovectorizer:
https://godbolt.org/z/xs35P45MM . In particular, the += operator is not
vectorized, while the + operator is vectorized in the same context. I do not
understand that. If you decrement the array size in foo from 2 to 1, it doesn't
work at all anymore: scalar operations are always generated for ARR_2x.

In general, my experience is that the autovectorizer kicks in much too late.
It should always vectorize from 2 values upwards, even if these are much
smaller than a simd register. This also saves a lot of memory accesses,
especially when the data is linear in memory (as in the example). Usually,
however, vectorization is only carried out when the data is at least as large
as a simd register, and often only when it is twice or even four times as
large.

I think you should urgently update/optimize the autovectorizer.

thx & regards
Gero
[Bug libstdc++/97600] [ranges] satisfaction value of range affected by prior use of basic_istream_view::begin()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97600 --- Comment #5 from CVS Commits --- The releases/gcc-10 branch has been updated by Patrick Palka : https://gcc.gnu.org/g:3d6bba85e1dd6cb7e213a7e6c060b9c8a0a346e2 commit r10-9737-g3d6bba85e1dd6cb7e213a7e6c060b9c8a0a346e2 Author: Patrick Palka Date: Fri Oct 30 20:33:19 2020 -0400 libstdc++: Don't initialize from *this inside some views [PR97600] This works around a subtle issue where instantiating the begin()/end() member of some views (as part of return type deduction) inadvertently requires computing the satisfaction value of range. This is problematic because the constraint range requires the begin()/end() member to be callable. But it's not callable until we've deduced its return type, so evaluation of range yields false at this point. And if after both members are instantiated (and their return types deduced) we evaluate range again, this time it will yield true since the begin()/end() members are now both callable. This makes the program ill-formed according to [temp.constr.atomic]/3: If, at different points in the program, the satisfaction result is different for identical atomic constraints and template arguments, the program is ill-formed, no diagnostic required. The views affected by this issue are those whose begin()/end() member has a placeholder return type and that member initializes an _Iterator or _Sentinel object from a reference to *this. The second condition is relevant because it means explicit conversion functions are considered during overload resolution (as per [over.match.copy], I think), and therefore it causes g++ to check the constraints of the conversion function view_interface::operator bool(). And this conversion function's constraints indirectly require range. This issue is observable on trunk only with basic_istream_view (as in the testcase in the PR). 
But a pending patch that makes g++ memoize constraint satisfaction values indefinitely (it currently invalidates the satisfaction cache on various events) causes many existing tests for the other affected views to fail, because range then remains false for the whole compilation. This patch works around this issue by adjusting the constructors of the _Iterator and _Sentinel types of the affected views to take their foo_view argument by pointer instead of by reference, so that g++ no longer considers explicit conversion functions when resolving the direct-initialization inside these views' begin()/end() members. libstdc++-v3/ChangeLog: PR libstdc++/97600 * include/std/ranges (basic_istream_view::begin): Initialize _Iterator from 'this' instead of '*this'. (basic_istream_view::_Iterator::_Iterator): Adjust constructor accordingly. (filter_view::_Iterator::_Iterator): Take a filter_view* argument instead of a filter_view& argument. (filter_view::_Sentinel::_Sentinel): Likewise. (filter_view::begin): Initialize _Iterator from 'this' instead of '*this'. (filter_view::end): Likewise. (transform_view::_Iterator::_Iterator): Take a _Parent* instead of a _Parent&. (filter_view::_Iterator::operator+): Adjust accordingly. (filter_view::_Iterator::operator-): Likewise. (filter_view::begin): Initialize _Iterator from 'this' instead of '*this'. (filter_view::end): Likewise. (join_view::_Iterator): Take a _Parent* instead of a _Parent&. (join_view::_Sentinel): Likewise. (join_view::begin): Initialize _Iterator from 'this' instead of '*this'. (join_view::end): Initialize _Sentinel from 'this' instead of '*this'. (split_view::_OuterIter): Take a _Parent& instead of a _Parent*. (split_view::begin): Initialize _OuterIter from 'this' instead of '*this'. (split_view::end): Likewise. * testsuite/std/ranges/97600.cc: New test. (cherry picked from commit afb8da7faa9dfe5a0d94ed45a373d74c076784ab)
[Bug testsuite/100170] New: Gcc tests gcc.target/powerpc/ppc-{eq,ne}0-1.c fail on Power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100170

            Bug ID: 100170
           Summary: Gcc tests gcc.target/powerpc/ppc-{eq,ne}0-1.c fail on
                    Power10
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

If you configure a compiler where the default code generation is power10, the
tests ppc-ne0-1.c and ppc-eq0-1.c fail.

The reason is that with power10, the compiler generates setbc and setbcr
instead of the expected cntlzw, isel, addic, subfe, and addze instructions.
[Bug testsuite/100166] Some vold-vec-{load,store} tests fail when built with compiler configured with --with-cpu=power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100166 --- Comment #2 from Michael Meissner --- gcc.target/powerpc/lvsl-lvsr.c is another test that needs prefixed load/store support added.
[Bug testsuite/100166] Some vold-vec-{load,store} tests fail when built with compiler configured with --with-cpu=power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100166 --- Comment #1 from Michael Meissner --- The test gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c also fails because it needs to add prefixed instruction support.
[Bug testsuite/100169] New: Test gcc.dg/sms-10.c fails on power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100169

            Bug ID: 100169
           Summary: Test gcc.dg/sms-10.c fails on power10
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I was comparing test results between a compiler configured for power10 and one
configured for power9.  The test gcc.dg/sms-10.c fails on power10.

The SMS pass fails on power10, and succeeds on power9.
[Bug testsuite/100168] New: Test gcc.dg/pr56727-2.c fails on power10 code generation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100168 Bug ID: 100168 Summary: Test gcc.dg/pr56727-2.c fails on power10 code generation Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: meissner at gcc dot gnu.org Target Milestone: --- The test gcc.dg/pr56727-2.c fails if you build a compiler with default power10 code generation (--with-cpu=power10). The test fails because it is expecting TOC calls (call with @plt and a NOP after the call). In power10, the default code generation now assumes the use of pc-relative calls and does not use the TOC or have a NOP after the call.
[Bug testsuite/100167] New: GCC configured for power10 fails the gcc.target/powerpc/fold-vec-div-longlong.c test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100167 Bug ID: 100167 Summary: GCC configured for power10 fails the gcc.target/powerpc/fold-vec-div-longlong.c test Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: meissner at gcc dot gnu.org Target Milestone: --- If you configure GCC using --with-cpu=power10, the fold-vec-div-longlong.c fails. This is due to the test being written for an earlier generation of Power computer where it needs to move the vector elements over to the GPR registers to do vector long long divide. With power10 code generation, the code is replaced by a single 'vdivsd' or 'vdivud' instruction. The test needs to be adjusted to disable power10 code generation.
[Bug testsuite/100166] New: Some vold-vec-{load,store} tests fail when built with compiler configured with --with-cpu=power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100166 Bug ID: 100166 Summary: Some vold-vec-{load,store} tests fail when built with compiler configured with --with-cpu=power10 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: meissner at gcc dot gnu.org Target Milestone: --- I noticed there are a bunch of the fold-vec-load* and fold-vec-store* tests that fail if you configure the compiler to default to power10 code generation. It looks like the regexp that matches the loads and stores need to be updated to know about the prefixed loads and stores: The failing tests in this category are: FAIL: gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c scan-assembler-times \\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c scan-assembler-times \\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c scan-assembler-times \\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: 
gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c scan-assembler-times \\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_xl-char.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_xl-double.c scan-assembler-times \\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-load-vec_xl-float.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-load-vec_xl-int.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c scan-assembler-times \\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-load-vec_xl-short.c scan-assembler-times \\mlxvw4x\\M|\\mlxvd2x\\M|\\mlxvx\\M|\\mlvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c scan-assembler-times \\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_xst-char.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_xst-double.c scan-assembler-times \\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-store-vec_xst-float.c scan-assembler-times \\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-store-vec_xst-int.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c scan-assembler-times \\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: 
gcc.target/powerpc/fold-vec-store-vec_xst-short.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c scan-assembler-times \\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c scan-assembler-times \\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 6 FAIL: gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c scan-assembler-times \\mstxvw4x\\M|\\mstxvd2x\\M|\\mstxvx\\M|\\mstvx\\M 12 FAIL: gc
[Bug target/95139] Messages using string concatenation can not be translated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95139 Andrew Pinski changed: What|Removed |Added Keywords||diagnostic Ever confirmed|0 |1 Last reconfirmed||2021-04-21 Status|UNCONFIRMED |NEW --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug c++/95226] [8 Regression] Faulty aggregate initialization of vector with struct with float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95226 Andrew Pinski changed: What|Removed |Added Known to work||8.1.0, 9.1.0, 9.2.0, 9.3.0 Summary|Faulty aggregate|[8 Regression] Faulty |initialization of vector|aggregate initialization of |with struct with float |vector with struct with ||float Target Milestone|--- |8.5
[Bug jit/95415] Add support for thread-local variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95415 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Ever confirmed|0 |1 Last reconfirmed||2021-04-21 Status|UNCONFIRMED |NEW --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug c/95513] Bad warning about control reaches end of function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95513

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-21
           Severity|normal                      |minor
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski ---
Confirmed, though I think GCC warns about the "control reaches end of function"
without much optimization done, so GCC does not know the switch statement is
fully taken care of.
[Bug lto/95548] ice in tree_to_shwi, at tree.c:7321
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95548 Andrew Pinski changed: What|Removed |Added Keywords||ice-on-valid-code Target Milestone|--- |10.3 Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #10 from Andrew Pinski --- Fixed for a few months now.
[Bug target/94680] Missed optimization with __builtin_shuffle and zero vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94680 --- Comment #3 from Andrew Pinski --- Noticed that aarch64 should have a similar optimization; filed PR 100165 for that.
[Bug target/100165] New: fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165 Bug ID: 100165 Summary: fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector Product: gcc Version: 11.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64-*-* Take: typedef double V __attribute__((vector_size(16))); typedef long long VI __attribute__((vector_size(16))); V foo (V x) { return __builtin_shuffle (x, (V) { 0, 0, }, (VI) {0, 3}); } - CUT Or typedef float V __attribute__((vector_size(16))); typedef int VI __attribute__((vector_size(16))); V foo (V x) { return __builtin_shuffle (x, (V) { 0, 0, 0, 0 }, (VI) {0, 1, 4, 5}); } CUT Both should just produce: fmov d0, d0 ret CUT The x86_64 specific version of this was PR 94680 which I just confirmed today.
[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164 --- Comment #9 from David Edelsohn --- The previous semaphore_base.h implementation had a fallback that hid a bug in the macros:
#if defined _GLIBCXX_HAVE_LINUX_FUTEX && !_GLIBCXX_REQUIRE_POSIX_SEMAPHORE
// Use futex if available and didn't force use of POSIX
using __fast_semaphore = __atomic_semaphore<__detail::__platform_wait_t>;
#elif _GLIBCXX_HAVE_POSIX_SEMAPHORE
using __fast_semaphore = __platform_semaphore;
#else
using __fast_semaphore = __atomic_semaphore;
#endif
The problem is that libstdc++ configure defines _GLIBCXX_HAVE_POSIX_SEMAPHORE in config.h. libstdc++ uses sed to rewrite config.h to c++config.h and prepends _GLIBCXX_, so c++config.h contains
#define _GLIBCXX__GLIBCXX_HAVE_POSIX_SEMAPHORE 1
And bits/semaphore_base.h is not testing that corrupted macro. Either semaphore_base.h needs to test for the corrupted macro, or libstdc++ configure needs to define HAVE_POSIX_SEMAPHORE without itself prepending _GLIBCXX_ so that the c++config.h rewriting works correctly and defines the correct macro for semaphore_base.h.
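The double-prefix effect described in this comment can be reproduced in miniature. The sketch below is only a simplified stand-in for the sed rewrite, not the actual libstdc++ build rule; prefix_macro and kTestedMacro are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the config.h -> c++config.h rewrite:
// every macro name gets "_GLIBCXX_" prepended, unconditionally.
std::string prefix_macro(const std::string& name) {
    assert(!name.empty());
    return "_GLIBCXX_" + name;
}

// The macro name that bits/semaphore_base.h tests, per the comment.
const std::string kTestedMacro = "_GLIBCXX_HAVE_POSIX_SEMAPHORE";
```

Feeding the rewrite a name that configure has already prefixed yields _GLIBCXX__GLIBCXX_HAVE_POSIX_SEMAPHORE, which no longer matches the tested name; feeding it plain HAVE_POSIX_SEMAPHORE yields the expected one.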
[Bug target/94680] Missed optimization with __builtin_shuffle and zero vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94680 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Severity|normal |enhancement Keywords||missed-optimization Last reconfirmed||2021-04-21 --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164 --- Comment #8 from David Edelsohn --- I am not certain why you cannot log in to the compile farm system. I am testing the patch on one of the AIX systems inside IBM.
[Bug target/100163] -falign-loops sometimes produces invalid code for SH-2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100163 Andrew Pinski changed: What|Removed |Added Resolution|--- |INVALID Status|WAITING |RESOLVED --- Comment #3 from Andrew Pinski --- Yes, this is not a bug. You can't use .align in the data section and expect nops to be emitted. Placing a function in the data section is what makes this invalid. See https://sourceware.org/binutils/docs-2.36/as/Align.html#Align "However, on most systems, if the section is marked as containing code and the fill value is omitted, the space is filled with no-op instructions." GCC assumes the function is in a section marked as containing code, and .data is not such a section.
[Bug sanitizer/100114] libasan built against latest glibc doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100114 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:3d0135bf3be416bbe2531dc763d19b749eb2b856 commit r9-9450-g3d0135bf3be416bbe2531dc763d19b749eb2b856 Author: Jakub Jelinek Date: Sat Apr 17 11:27:14 2021 +0200 sanitizer: Fix asan against glibc 2.34 [PR100114] As mentioned in the PR, SIGSTKSZ is no longer a compile time constant in glibc 2.34 and later, so static const uptr kAltStackSize = SIGSTKSZ * 4; needs dynamic initialization, but is used by a function called indirectly from .preinit_array and therefore before the variable is constructed. This results in using 0 size instead and all asan instrumented programs die with: ==91==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22) Here is a cherry-pick from upstream to fix this. 2021-04-17 Jakub Jelinek PR sanitizer/100114 * sanitizer_common/sanitizer_posix_libcdep.cc: Cherry-pick llvm-project revisions 82150606fb11d28813ae6da1101f5bda638165fe and b93629dd335ffee2fc4b9b619bf86c3f9e6b0023. (cherry picked from commit 950bac27d63c1c2ac3a6ed867692d6a13f21feb3)
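One way to avoid the initialization-order problem described in this commit message is to compute the size on demand instead of storing it in a namespace-scope constant. The following is a minimal sketch under that assumption, for a POSIX system; alt_stack_size is a hypothetical name and this is not claimed to be the cherry-picked upstream change.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>

// Hypothetical helper: compute the alternate signal stack size when asked.
// A plain function has no dependency on static initialization order, unlike
//   static const uptr kAltStackSize = SIGSTKSZ * 4;
// which needs a dynamic initializer once SIGSTKSZ is not a constant.
std::size_t alt_stack_size() {
    const long base = SIGSTKSZ;  // may expand to a runtime call on newer glibc
    assert(base > 0);
    return static_cast<std::size_t>(base) * 4;
}
```

Because the value is produced at the call site, a caller that runs before static constructors still sees a non-zero size.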
[Bug jit/100096] libgccjit.so.0: Cannot write-enable text segment: Permission denied on NetBSD 9.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100096 --- Comment #26 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:e173d85243b7732aa3ef29ebf7ecc6a54d21320c commit r9-9449-ge173d85243b7732aa3ef29ebf7ecc6a54d21320c Author: Jakub Jelinek Date: Fri Apr 16 18:32:27 2021 +0200 intl: Add --enable-host-shared support [PR100096] As mentioned in the PR, building gcc with jit enabled and --enable-host-shared doesn't work on NetBSD/i?86, as libgccjit.so.0 has text relocations. The r0-125846-g459260ecf8b420b029601a664cdb21c185268ecb changes added --enable-host-shared support to various libraries, but didn't add it to intl/ subdirectory; on Linux it isn't really needed, because all: all-no all-no: #nothing but on other OSes intl/libintl.a is built. The following patch makes sure it is built with -fPIC when --enable-host-shared is used. 2021-04-16 Jakub Jelinek PR jit/100096 * configure.ac: Add --enable-host-shared support. * Makefile.in: Update copyright. Add @PICFLAG@ to CFLAGS. * configure: Regenerated. (cherry picked from commit a11f31102706e33f66b60367d6863613ab3bd051)
[Bug target/99767] [9 Regression] ICE in expand_direct_optab_fn, at internal-fn.c:3360
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99767 --- Comment #11 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:7cbe3b2fa21524dd2a3ba6d104f231ff88821622 commit r9-9448-g7cbe3b2fa21524dd2a3ba6d104f231ff88821622 Author: Jakub Jelinek Date: Fri Apr 16 11:44:04 2021 +0200 vectorizer: Remove dead scalar .COND_* calls from vectorized loops [PR99767] The following testcase ICEs because disabling of DCE means there are dead stmts in the loop (though, in theory they could become dead only shortly before if-conv through some optimization), ifcvt which goes through all stmts in the loop if-converts them into .COND_DIV etc. internal fn calls in the copy of the loop meant for vectorization only, the loop is successfully vectorized but the particular .COND_* call is not because it isn't a live statement and the scalar .COND_* remains in the IL until expansion where it ICEs because these ifns only support vectors and not scalars. These ifns are similar to .MASK_{LOAD,STORE} in this behavior. One possible fix could be to expand scalar versions of them during expansion, basically undoing what if-conv did to create them, i.e. expand them as the lhs = else; if (mask) { lhs = statement; } or so. For .MASK_LOAD we have code to replace them in vect_transform_loop already though (not needed for .MASK_STORE, as stores should be always live and thus always vectorized), so this patch instead replaces .COND_* similarly to .MASK_LOAD in that loop, with the small difference that lhs = .MASK_LOAD (...); is replaced by lhs = 0; while lhs = .COND_* (..., else_arg); is replaced by lhs = else_arg. The statement must be dead, otherwise it would be vectorized, so I think it is not a big deal we don't turn it back into multiple basic blocks etc. (and it might be not possible to do that at that point). 
2021-04-16 Jakub Jelinek PR target/99767 * tree-vect-loop.c (vect_transform_loop): Don't remove just dead scalar .MASK_LOAD calls, but also dead .COND_* calls - replace them by their last argument. * gcc.target/aarch64/pr99767.c: New test. (cherry picked from commit 1730b5d6793127b1a47970f44d60da8082bab514)
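The scalar meaning of a conditional internal function, as spelled out in the commit message (lhs = else; if (mask) lhs = statement;), can be sketched for the division case as follows. cond_div_scalar is a hypothetical name; the real .COND_DIV exists only as a vector internal function inside GCC.

```cpp
#include <cassert>

// Scalar model of a conditional division: the result is the else value
// unless the mask is set, in which case the division is performed.
int cond_div_scalar(bool mask, int a, int b, int else_arg) {
    int lhs = else_arg;
    if (mask) {
        assert(b != 0);  // the division only happens under the mask
        lhs = a / b;
    }
    return lhs;
}
```

Since a dead call's result is never used, substituting lhs = else_arg for it, as the patch does, cannot change observable behaviour.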
[Bug c++/99833] [8/9 Regression] structured binding + if init + generic lambda = internal compiler error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99833 --- Comment #15 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:96f7b35903338738c7e7e4d99aa929276404 commit r9-9447-g96f7b35903338738c7e7e4d99aa929276404 Author: Jakub Jelinek Date: Fri Apr 16 09:32:44 2021 +0200 c++: Fix up handling of structured bindings in extract_locals_r [PR99833] The following testcase ICEs in tsubst_decomp_names because the assumptions that the structured binding artificial var is followed in DECL_CHAIN by the corresponding structured binding vars is violated. I've tracked it to extract_locals* which is done for the constexpr IF_STMT. extract_locals_r when it sees a DECL_EXPR adds that decl into a hash set so that such decls aren't returned from extract_locals*, but in the case of a structured binding that just means the artificial var and not the vars corresponding to structured binding identifiers. The following patch fixes it by pushing not just the artificial var for structured bindings but also the other vars. 2021-04-16 Jakub Jelinek PR c++/99833 * pt.c (extract_locals_r): When handling DECL_EXPR of a structured binding, add to data.internal also all corresponding structured binding decls. * g++.dg/cpp1z/pr99833.C: New test. (cherry picked from commit 06d50ebc9fb2761ed2bdda5e76adb4d47a8ca983)
[Bug rtl-optimization/99905] [8/9 Regression] wrong code with -mno-mmx -mno-sse since r7-4540-gb229ab2a712ccd44
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99905 --- Comment #13 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:00fc4d9fff1f38c205b52e0723eb3b03989ee292 commit r9-9446-g00fc4d9fff1f38c205b52e0723eb3b03989ee292 Author: Jakub Jelinek Date: Tue Apr 13 01:01:45 2021 +0200 combine: Fix up expand_compound_operation [PR99905] The following testcase is miscompiled on x86_64-linux. expand_compound_operation is called on (zero_extract:DI (mem/c:TI (reg/f:DI 16 argp) [3 i+0 S16 A128]) (const_int 16 [0x10]) (const_int 63 [0x3f])) so mode is DImode, inner_mode is TImode, pos 63, len 16 and modewidth 64. A couple of lines above the problematic spot we have: if (modewidth >= pos + len) { tem = gen_lowpart (mode, XEXP (x, 0)); where the code uses gen_lowpart and then shift left/right to extract it in mode. But the guarding condition is false - 64 >= 63 + 16 and so we enter the next condition, where the code shifts XEXP (x, 0) right by pos and then adds AND. It does so incorrectly though. Given the modewidth < pos + len, inner_mode must be necessarily larger than mode and XEXP (x, 0) has the innermode, but it was calling simplify_shift_const with mode rather than inner_mode, which meant inconsistent arguments to simplify_shift_const and in this case made a DImode MEM shift out of it. The following patch fixes it, by doing the shift in inner_mode properly and then after the shift doing the lowpart subreg and masking already in mode. 2021-04-13 Jakub Jelinek PR rtl-optimization/99905 * combine.c (expand_compound_operation): If pos + len > modewidth, perform the right shift by pos in inner_mode and then convert to mode, instead of trying to simplify a shift of rtx with inner_mode by pos as if it was a shift in mode. * gcc.target/i386/pr99905.c: New test. (cherry picked from commit c965254e5af9dc68444e0289250c393ae0cd6131)
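The ordering issue in this fix (shift in the wide mode first, then narrow and mask) can be illustrated with plain integers. This is only a loose model of the RTL transformation, assuming a compiler that provides the unsigned __int128 extension; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Extract `len` bits starting at bit `pos` from a 128-bit value.
// The shift is done in the wide type; only afterwards is the result
// narrowed to 64 bits and masked.
std::uint64_t extract_wide(unsigned __int128 x, unsigned pos, unsigned len) {
    assert(len > 0 && len < 64 && pos + len <= 128);
    const unsigned __int128 shifted = x >> pos;
    const std::uint64_t low = static_cast<std::uint64_t>(shifted);
    return low & ((std::uint64_t{1} << len) - 1);
}

// Wrong order, for comparison: narrowing first throws away bits 64..127,
// so a field that straddles bit 64 loses its upper part.
std::uint64_t extract_narrow_first(unsigned __int128 x, unsigned pos, unsigned len) {
    assert(len > 0 && len < 64 && pos < 64);
    const std::uint64_t low = static_cast<std::uint64_t>(x);
    return (low >> pos) & ((std::uint64_t{1} << len) - 1);
}
```

With pos 63 and len 16, as in the commit message, the field spans bits 63 to 78, so only the wide-first version recovers all 16 bits.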
[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830 --- Comment #18 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6bb1dccf0defbcf245b0804211e20be8624c1f40 commit r9-9445-g6bb1dccf0defbcf245b0804211e20be8624c1f40 Author: Jakub Jelinek Date: Tue Apr 13 01:00:48 2021 +0200 combine: Don't fold away side-effects in simplify_and_const_int_1 [PR99830] Here is an alternate patch for the PR99830 bug. As discussed on IRC and in the PR, the reason why a (clobber:TI (const_int 0)) has been propagated into the debug insns is that it got optimized away during simplification from the i3 instruction pattern. And that happened because simplify_and_const_int_1 (SImode, varop, 255) with varop of (ashift:SI (subreg:SI (and:TI (clobber:TI (const_int 0 [0])) (const_int 255 [0xff])) 0) (const_int 16 [0x10])) was called and through nonzero_bits determined that (whatever << 16) & 255 is const0_rtx. It is, but if there are side-effects in varop and such clobbers are considered as such, we shouldn't optimize those away. 2021-04-13 Jakub Jelinek PR debug/99830 * combine.c (simplify_and_const_int_1): Don't optimize varop away if it has side-effects. * gcc.dg/pr99830.c: New test. (cherry picked from commit 4ac7483ede91fef7cfd548ff6e30e46eeb9d95ae)
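A source-level analogy for the rule in this fix: (x << 16) & 255 is always zero, yet the operand must still be evaluated if it has a side effect. The sketch below uses a call counter as the side effect; in the RTL case the side effect is a clobber, so this is an illustration rather than a reproduction, and all names are hypothetical.

```cpp
#include <cassert>

int g_calls = 0;  // hypothetical counter standing in for a side effect

unsigned next_value() {
    ++g_calls;
    return 0x1234u;
}

// The low 8 bits of any value shifted left by 16 are zero, so the result
// is known in advance, but next_value() must still run exactly once.
unsigned masked_shift() {
    const unsigned result = (next_value() << 16) & 255u;
    assert(result == 0u);
    return result;
}
```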
[Bug c/99990] [8/9 Regression] ICE in gimplifier on invalid va_arg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99990 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:9861f00a08a5f5fecd2c1c4135d3d540b0ed9cc7 commit r9-9444-g9861f00a08a5f5fecd2c1c4135d3d540b0ed9cc7 Author: Jakub Jelinek Date: Sat Apr 10 17:01:54 2021 +0200 c: Avoid clobbering TREE_TYPE (error_mark_node) [PR99990] The following testcase ICEs during error recovery, because finish_decl overwrites TREE_TYPE (error_mark_node), which should always remain error_mark_node. 2021-04-10 Jakub Jelinek PR c/99990 * c-decl.c (finish_decl): Don't overwrite TREE_TYPE of error_mark_node. * gcc.dg/pr99990.c: New test. (cherry picked from commit 91e076f3a66c1c9f6aa51e9d53d07803606e3bf1)
[Bug lto/99849] [8/9 Regression] ICE in expand_expr_real_1, at expr.c:11556 since r5-5407-g30d5d8c5189064c8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99849 --- Comment #8 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:49a7e7d0fc5fcc28ed13b6d67faf99a5dfe03f65 commit r9-9443-g49a7e7d0fc5fcc28ed13b6d67faf99a5dfe03f65 Author: Jakub Jelinek Date: Sat Apr 10 12:49:01 2021 +0200 expand: Fix up LTO ICE with COMPOUND_LITERAL_EXPR [PR99849] The gimplifier optimizes away COMPOUND_LITERAL_EXPRs, but they can remain in the form of ADDR_EXPR of COMPOUND_LITERAL_EXPRs in static initializers. By the TREE_STATIC check I meant to check that the underlying decl of the compound literal is a global rather than automatic variable which obviously can't be referenced in static initializers, but unfortunately with LTO it might end up in another partition and thus be DECL_EXTERNAL instead. 2021-04-10 Jakub Jelinek PR lto/99849 * expr.c (expand_expr_addr_expr_1): Test is_global_var rather than just TREE_STATIC on COMPOUND_LITERAL_EXPR_DECLs. * gcc.dg/lto/pr99849_0.c: New test. (cherry picked from commit 2e57bc7eedb084869d17fe07b538d907b8fee819)
[Bug rtl-optimization/98601] [8/9 Regression] aarch64: ICE in rtx_addr_can_trap_p_1, at rtlanal.c:467
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98601 --- Comment #11 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:86e761b46de55532db35f257ea67071512804a58 commit r9-9442-g86e761b46de55532db35f257ea67071512804a58 Author: Jakub Jelinek Date: Sat Apr 10 12:46:09 2021 +0200 rtlanal: Another fix for VOIDmode MEMs [PR98601] This is a sequel to the PR85022 changes, inline-asm can (unfortunately) introduce VOIDmode MEMs and in PR85022 they have been changed so that we don't pretend we know their size (as opposed to assuming they have zero size). This time we ICE in rtx_addr_can_trap_p_1 because it assumes that all memory but BLKmode has known size. The patch just treats VOIDmode MEMs like BLKmode in that regard. And, the STRICT_ALIGNMENT change is needed because VOIDmode has GET_MODE_SIZE of 0 and we don't want to check if something is a multiple of 0. 2021-04-10 Jakub Jelinek PR rtl-optimization/98601 * rtlanal.c (rtx_addr_can_trap_p_1): Allow in assert unknown size not just for BLKmode, but also for VOIDmode. For STRICT_ALIGNMENT unaligned_mems handle VOIDmode like BLKmode. * gcc.dg/torture/pr98601.c: New test. (cherry picked from commit e68ac8c2b46997af1464f2549ac520a192c928b1)
[Bug rtl-optimization/99863] [10 Regression] wrong code with -O -fno-tree-forwprop -mno-sse2 since r10-7268-g529ea7d9596b26ba
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99863 --- Comment #23 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:e9d56fab891afa3d40ed2ad75264c09aee65ab88 commit r9-9441-ge9d56fab891afa3d40ed2ad75264c09aee65ab88 Author: Jakub Jelinek Date: Sat Apr 3 10:07:09 2021 +0200 dse: Fix up hard reg conflict checking in replace_read [PR99863] Since PR37922 fix RTL DSE has hard register conflict checking in replace_read, so that if the replacement sequence sets (or typically just clobbers) some hard register (usually condition codes) we verify that hard register is not live. Unfortunately, it compares the hard reg set clobbered/set by the sequence (regs_set) against the currently live hard register set, but it then emits the insn sequence not at the current insn position, but before store_insn->insn. So, we should not compare against the current live hard register set, but against the hard register live set at the point of the store insn. Fortunately, we already have that remembered in store_insn->fixed_regs_live. In addition to bootstrapping/regtesting this patch on x86_64-linux and i686-linux, I've also added statistics gathering and it seems the only place where we end up rejecting the replace_read is the newly added testcase (the PR37922 is no longer effective at that) and fixed_regs_live has been always non-NULL at the if (store_insn->fixed_regs_live) spot. Rather than having there an assert, I chose to just keep regs_set as is, which means in that hypothetical case where fixed_regs_live wouldn't be computed for some store we'd still accept sequences that don't clobber/set any hard registers and just punt on those that clobber/set those. 2021-04-03 Jakub Jelinek PR rtl-optimization/99863 * dse.c (replace_read): Drop regs_live argument. Instead of regs_live, use store_insn->fixed_regs_live if non-NULL, otherwise punt if insns sequence clobbers or sets any hard registers. * gcc.target/i386/pr99863.c: New test. 
(cherry picked from commit 7a2f91d413eb7a3eb0ba52c7ac9618a35addd12a)
[Bug c++/99790] [8/9 Regression] internal compiler error: in expand_expr_real_2 since r7-3811
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99790 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:41c21d0e51e82c8303a8ca03c69546f86caa1b92 commit r9-9440-g41c21d0e51e82c8303a8ca03c69546f86caa1b92 Author: Jakub Jelinek Date: Tue Mar 30 18:15:32 2021 +0200 c++: Fix ICE on PTRMEM_CST in lambda in inline var initializer [PR99790] The following testcase ICEs (since the addition of inline var support), because the lambda contains PTRMEM_CST but finish_function is called for the lambda quite early during parsing it (from finish_lambda_function) when the containing class is still incomplete. That means that during genericization cplus_expand_constant keeps the PTRMEM_CST unmodified, but later nothing lowers it when the class is finalized. Using sizeof etc. on the class in such contexts is rejected by both g++ and clang++, and when the PTRMEM_CST appears e.g. in static var initializers rather than in functions, we handle it correctly because c_parse_final_cleanups -> lower_var_init will handle those cplus_expand_constant when all classes are already finalized. The following patch fixes it by calling cplus_expand_constant again during gimplification, as we are now unconditionally unit at a time, I'd think everything that could be completed will be before we start gimplification. 2021-03-30 Jakub Jelinek PR c++/99790 * cp-gimplify.c (cp_gimplify_expr): Handle PTRMEM_CST. * g++.dg/cpp1z/pr99790.C: New test. (cherry picked from commit 7cdd30b43a63832d6f908b2dd64bd19a0817cd7b)
[Bug tree-optimization/99777] [11 Regression] ICE in build2, at tree.c:4869 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99777 --- Comment #7 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:a784483132fd7f3830b96e5a606d8eeb8f64e5ce commit r9-9439-ga784483132fd7f3830b96e5a606d8eeb8f64e5ce Author: Jakub Jelinek Date: Mon Mar 29 12:35:32 2021 +0200 fold-const: Fix ICE in extract_muldiv_1 [PR99777] extract_muldiv{,_1} is apparently only prepared to handle scalar integer operations, the callers ensure it by only calling it if the divisor or one of the multiplicands is INTEGER_CST and because neither multiplication nor division nor modulo are really supported e.g. for pointer types, nullptr type etc. But the CASE_CONVERT handling doesn't really check if it isn't a cast from some other type kind, so on the testcase we end up trying to build MULT_EXPR in POINTER_TYPE which ICEs. A few years ago Marek has added ANY_INTEGRAL_TYPE_P checks to two spots, but the code uses TYPE_PRECISION which means something completely different for vector types, etc. So IMNSHO we should just punt on conversions from non-integrals or non-scalar integrals. 2021-03-29 Jakub Jelinek PR tree-optimization/99777 * fold-const.c (extract_muldiv_1): For conversions, punt on casts from types other than scalar integral types. * g++.dg/torture/pr99777.C: New test. (cherry picked from commit afe9a630eae114665e77402ea083201c9d406e99)
[Bug debug/99334] Generated DWARF unwind table issue while on instructions where rbp is pointing to callers stack frame
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99334 --- Comment #12 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:c1780b4c6f1c0e91f10d13e70eb17f5d77f22bb0 commit r9-9438-gc1780b4c6f1c0e91f10d13e70eb17f5d77f22bb0 Author: Jakub Jelinek Date: Sat Mar 27 00:20:42 2021 +0100 dwarf2cfi: Defer queued register saves some more [PR99334] On the testcase in the PR with -fno-tree-sink -O3 -fPIC -fomit-frame-pointer -fno-strict-aliasing -mstackrealign we have prologue:
<_func_with_dwarf_issue_>:
   0: 4c 8d 54 24 08    lea 0x8(%rsp),%r10
   5: 48 83 e4 f0       and $0xfffffffffffffff0,%rsp
   9: 41 ff 72 f8       pushq -0x8(%r10)
   d: 55                push %rbp
   e: 48 89 e5          mov %rsp,%rbp
  11: 41 57             push %r15
  13: 41 56             push %r14
  15: 41 55             push %r13
  17: 41 54             push %r12
  19: 41 52             push %r10
  1b: 53                push %rbx
  1c: 48 83 ec 20       sub $0x20,%rsp
and emit
0014 CIE
  Version: 1
  Augmentation: "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data: 1b
  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8
  DW_CFA_nop
  DW_CFA_nop
0018 0044 001c FDE cie= pc=..01d5
  DW_CFA_advance_loc: 5 to 0005
  DW_CFA_def_cfa: r10 (r10) ofs 0
  DW_CFA_advance_loc: 9 to 000e
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 13 to 001b
  DW_CFA_def_cfa_expression (DW_OP_breg6 (rbp): -40; DW_OP_deref)
  DW_CFA_expression: r15 (r15) (DW_OP_breg6 (rbp): -8)
  DW_CFA_expression: r14 (r14) (DW_OP_breg6 (rbp): -16)
  DW_CFA_expression: r13 (r13) (DW_OP_breg6 (rbp): -24)
  DW_CFA_expression: r12 (r12) (DW_OP_breg6 (rbp): -32)
  ...
unwind info for that. The problem is when async signal (or stepping through in the debugger) stops after the pushq %rbp instruction and before movq %rsp, %rbp, the unwind info says that caller's %rbp is saved there at *%rbp, but that is not true, caller's %rbp is either still available in the %rbp register, or in *%rsp, only after executing the next instruction - movq %rsp, %rbp - the location for %rbp is correct.
So, either we'd need to temporarily say:
  DW_CFA_advance_loc: 9 to 000e
  DW_CFA_expression: r6 (rbp) (DW_OP_breg7 (rsp): 0)
  DW_CFA_advance_loc: 3 to 0011
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 10 to 001b
or to me it seems more compact to just say:
  DW_CFA_advance_loc: 12 to 0011
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 10 to 001b
I've tried instead to deal with it through REG_FRAME_RELATED_EXPR from the backend, but that failed miserably as explained in the PR, dwarf2cfi.c has some rules (Rule 16 to Rule 19) that are specific to the dynamic stack realignment using drap register that only the i386 backend does right now, and by using REG_FRAME_RELATED_EXPR or REG_CFA* notes we can't emulate those rules. The following patch instead does the deferring of the hard frame pointer save rule in dwarf2cfi.c Rule 18 handling and emits it on the (set hfp sp) assignment that must appear shortly after it and adds assertion that it is the case. The difference before/after the patch on the assembly is:
--- pr99334.s~ 2021-03-26 15:42:40.881749380 +0100
+++ pr99334.s 2021-03-26 17:38:05.729161910 +0100
@@ -11,8 +11,8 @@ _func_with_dwarf_issue_:
  andq $-16, %rsp
  pushq -8(%r10)
  pushq %rbp
- .cfi_escape 0x10,0x6,0x2,0x76,0
  movq %rsp, %rbp
+ .cfi_escape 0x10,0x6,0x2,0x76,0
  pushq %r15
  pushq %r14
  pushq %r13
i.e. does just what we IMHO need, after pushq %rbp %rbp still contains parent's frame value and so the save rule doesn't need to be overridden there, ditto at the start of the next insn before the side-effect took effect, and we override it only after it when %rbp already has the right value. If some other target adds dynamic stack realignment in the future and the offset 0 case wouldn't be true there, the code can be adjusted so that it works on all the drap architectures, I'm pretty sure the code would need other adjustments too. For the rule 18 and for the (set hfp sp) after it we already have asserts for the
[Bug c++/99745] ICE when parameter pack not expanded in bit field
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99745 --- Comment #6 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:4b30a6d214ad94dfdffa93843d628484d2555b51 commit r9-9437-g4b30a6d214ad94dfdffa93843d628484d2555b51 Author: Jakub Jelinek Date: Thu Mar 25 21:06:09 2021 +0100 c++: Diagnose bare parameter packs in bitfield widths [PR99745] The following invalid tests ICE because we don't diagnose (and drop) bare parameter packs in bitfield widths. 2021-03-25 Jakub Jelinek PR c++/99745 * decl2.c (grokbitfield): Diagnose bitfields containing bare parameter packs and don't set DECL_BIT_FIELD_REPRESENTATIVE in that case. * g++.dg/cpp0x/variadic181.C: New test. (cherry picked from commit f8780caf07340f5d5e55cf5fb1b2be07cabab1ea)
[Bug c++/99650] ICE when trying to form reference to void in structured binding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99650 --- Comment #7 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:dd320787b4eb11521e3ae3f9aa9504b31ee08c36 commit r9-9436-gdd320787b4eb11521e3ae3f9aa9504b31ee08c36 Author: Jakub Jelinek Date: Tue Mar 23 10:23:42 2021 +0100 c++: Diagnose references to void in structured bindings [PR99650] We ICE on the following testcase, because std::tuple_element<...,...>::type is void and for structured bindings we therefore need to create void & or void && which is invalid. We created such REFERENCE_TYPE and later ICEd in the middle-end. The following patch fixes it by diagnosing that. 2021-03-23 Jakub Jelinek PR c++/99650 * decl.c (cp_finish_decomp): Diagnose void initializers when using tuple_element and get. * g++.dg/cpp1z/decomp55.C: New test. (cherry picked from commit d5e379e3fe19362442b5d0ac608fb8ddf67fecd3)
[Bug debug/99388] Invalid debug info for __fp16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99388 --- Comment #5 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:ce1bf41ff68a58512424daa473bea3e77de3eff4 commit r9-9435-gce1bf41ff68a58512424daa473bea3e77de3eff4 Author: Jakub Jelinek Date: Sun Mar 21 17:27:39 2021 +0100 dwarf2out: Fix debug info for 2 byte floats [PR99388] Aarch64, ARM and a couple of other architectures have 16-bit floats, HFmode. As can be seen e.g. on void foo (void) { __fp16 a = 1.0; asm ("nop"); a = 2.0; asm ("nop"); a = 3.0; asm ("nop"); } testcase, GCC mishandles this on the dwarf2out.c side by assuming all floating point types have sizes in multiples of 4 bytes, so what GCC emits is it says that e.g. the DW_OP_implicit_value will be 2 bytes but then doesn't emit anything and so anything emitted after it is treated by consumers as the value and then they get out of sync. real_to_target which insert_float uses indeed fills it that way, but putting into an array of long 32 bits each time, but for the half floats it puts everything into the least significant 16 bits of the first long no matter what endianity host or target has. The following patch fixes it. 
With the patch the -g -O2 -dA output changes (in a cross without .uleb128 support): .byte 0x9e// DW_OP_implicit_value .byte 0x2 // uleb128 0x2 + .2byte 0x3c00 // fp or vector constant word 0 .byte 0x7 // DW_LLE_start_end (*.LLST0) .8byte .LVL1 // Location list begin address (*.LLST0) .8byte .LVL2 // Location list end address (*.LLST0) .byte 0x4 // uleb128 0x4; Location expression size .byte 0x9e// DW_OP_implicit_value .byte 0x2 // uleb128 0x2 + .2byte 0x4000 // fp or vector constant word 0 .byte 0x7 // DW_LLE_start_end (*.LLST0) .8byte .LVL2 // Location list begin address (*.LLST0) .8byte .LFE0 // Location list end address (*.LLST0) .byte 0x4 // uleb128 0x4; Location expression size .byte 0x9e// DW_OP_implicit_value .byte 0x2 // uleb128 0x2 + .2byte 0x4200 // fp or vector constant word 0 .byte 0 // DW_LLE_end_of_list (*.LLST0) Bootstrapped/regtested on x86_64-linux, aarch64-linux and armv7hl-linux-gnueabi, ok for trunk? I fear the CONST_VECTOR case is still broken, while HFmode elements of vectors should be fine (it uses eltsize of the element sizes) and likewise SFmode could be fine, DFmode vectors are emitted as two 32-bit ints regardless of endianity and I'm afraid it can't be right on big-endian. But I haven't been able to create a testcase that emits a CONST_VECTOR, for e.g. unused vector vars with constant operands we emit CONCATN during expansion and thus ... DW_OP_*piece for each element of the vector and for DW_TAG_call_site_parameter we give up (because we handle CONST_VECTOR only in loc_descriptor, not mem_loc_descriptor). 2021-03-21 Jakub Jelinek PR debug/99388 * dwarf2out.c (insert_float): Change return type from void to unsigned, handle GET_MODE_SIZE (mode) == 2 and return element size. (mem_loc_descriptor, loc_descriptor, add_const_value_attribute): Adjust callers. (cherry picked from commit d3dd3703f1d42b14c88b91e51a2a775fe00a2974)
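The three .2byte constants in the output above (0x3c00, 0x4000, 0x4200) are the IEEE 754 binary16 encodings of the values 1.0, 2.0 and 3.0 assigned in the testcase. A small decoder makes this easy to check; half_bits_to_float is a hypothetical helper written for this illustration and has nothing to do with dwarf2out.c itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an IEEE 754 binary16 bit pattern (1 sign bit, 5 exponent bits,
// 10 mantissa bits) into a float.
float half_bits_to_float(std::uint16_t bits) {
    assert(std::numeric_limits<float>::is_iec559);  // needed for inf/NaN below
    const int sign = bits >> 15;
    const int exponent = (bits >> 10) & 0x1f;
    const int mantissa = bits & 0x3ff;
    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24.
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                             : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + mantissa) * 2^(exponent - 25).
        magnitude = std::ldexp(static_cast<float>(mantissa | 0x400), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```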
[Bug c/99588] variable set but not used warning on static _Atomic assignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99588 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:803a95e2a0134105dd259d7ccd258744e94c3233 commit r9-9434-g803a95e2a0134105dd259d7ccd258744e94c3233 Author: Jakub Jelinek Date: Fri Mar 19 22:54:31 2021 +0100 c: Fix up -Wunused-but-set-* warnings for _Atomics [PR99588] As the following testcases show, compared to -D_Atomic= case we have many -Wunused-but-set-* warning false positives. When an _Atomic variable/parameter is read, we call mark_exp_read on it in convert_lvalue_to_rvalue, but build_atomic_assign does not. For consistency with the non-_Atomic case where we mark_exp_read the lhs for lhs op= ... but not for lhs = ..., this patch does that too. But furthermore we need to pattern match the trees emitted by _Atomic store, so that _Atomic store itself is not marked as being a variable read, but when the result of the store is used, we mark it. 2021-03-19 Jakub Jelinek PR c/99588 * c-typeck.c (mark_exp_read): Recognize what build_atomic_assign with modifycode NOP_EXPR produces and mark the _Atomic var as read if found. (build_atomic_assign): For modifycode of NOP_EXPR, use COMPOUND_EXPRs rather than STATEMENT_LIST. Otherwise call mark_exp_read on lhs. Set TREE_SIDE_EFFECTS on the TARGET_EXPR. * gcc.dg/Wunused-var-5.c: New test. * gcc.dg/Wunused-var-6.c: New test. (cherry picked from commit b1fc1f1c4b2e9005c40ed476b067577da2d2ce84)
[Bug target/99542] [9 Regression] ICE in exact_div, at poly-int.h:2219
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99542 --- Comment #12 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:4eb2e3eb0f4a28fade00c1dca626cc947d20e7c4 commit r9-9433-g4eb2e3eb0f4a28fade00c1dca626cc947d20e7c4 Author: Christophe Lyon Date: Tue Mar 16 21:48:10 2021 + aarch64: Fix up aarch64_simd_clone_compute_vecsize_and_simdlen [PR99542] The gcc.dg/declare-simd.c test does not emit a warning with -mabi=ilp32. 2021-03-16 Christophe Lyon PR target/99542 gcc/testsuite/ * gcc.dg/declare-simd.c (fn2): Expect a warning only under lp64. (cherry picked from commit d6300df5f2b9fafa07be4f974fef1ed810d0e7fd)
[Bug c++/99613] Static variable destruction order race condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99613 --- Comment #20 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:cb6efb9d909323e41fa0a8abbe805eb32d1659ea commit r9-9432-gcb6efb9d909323e41fa0a8abbe805eb32d1659ea Author: Jakub Jelinek Date: Tue Mar 16 21:17:44 2021 +0100 c++: Ensure correct destruction order of local statics [PR99613] As mentioned in the PR, if end of two constructions of local statics is strongly ordered, their destructors should be run in the reverse order. As we run __cxa_guard_release before calling __cxa_atexit, it is possible that we have two threads that access two local statics in the same order for the first time, one thread wins the __cxa_guard_acquire on the first one but is rescheduled in between the __cxa_guard_release and __cxa_atexit calls, then the other thread is scheduled and wins __cxa_guard_acquire on the second one and calls __cxa_guard_release and __cxa_atexit and only afterwards the first thread calls its __cxa_atexit. This means a variable whose completion of the constructor strongly happened after the completion of the other one will be destructed after the other variable is destructed. The following patch fixes that by swapping the __cxa_guard_release and __cxa_atexit calls. 2021-03-16 Jakub Jelinek PR c++/99613 * decl.c (expand_static_init): For thread guards, call __cxa_atexit before calling __cxa_guard_release rather than after it. Formatting fixes. (cherry picked from commit 1703937a05b8b95bc29d2de292387dfd9eb7c9a3)
[Bug target/99542] [9 Regression] ICE in exact_div, at poly-int.h:2219
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99542 --- Comment #11 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:0fe231e4dde882cd56029e122abc4fa940ad4ac5 commit r9-9431-g0fe231e4dde882cd56029e122abc4fa940ad4ac5 Author: Jakub Jelinek Date: Tue Mar 16 10:34:44 2021 +0100 aarch64: Fix up aarch64_simd_clone_compute_vecsize_and_simdlen [PR99542] As the patch shows, there are several bugs in aarch64_simd_clone_compute_vecsize_and_simdlen. One is that unlike for function declarations that aren't definitions it completely ignores argument types. Such decls don't have DECL_ARGUMENTS, but we can walk TYPE_ARG_TYPES instead, like the i386 backend does or like the simd cloning code in the middle end does too. Another problem is that it checks types of uniform arguments. That is unnecessary, uniform arguments are passed the way it normally is, it is a scalar argument rather than vector, so there is no reason not to support uniform argument of different size, or long double, structure etc. 2021-03-16 Jakub Jelinek PR target/99542 * config/aarch64/aarch64.c (aarch64_simd_clone_compute_vecsize_and_simdlen): If not a function definition, walk TYPE_ARG_TYPES list if non-NULL for argument types instead of DECL_ARGUMENTS. Ignore types for uniform arguments. * gcc.dg/gomp/pr99542.c: New test. * gcc.dg/gomp/pr59669-2.c (bar): Don't expect a warning on aarch64. * gcc.dg/gomp/simd-clones-2.c (setArray): Likewise. * g++.dg/vect/simd-clone-7.cc (bar): Likewise. * g++.dg/gomp/declare-simd-1.C (f37): Expect a different warning on aarch64. * gcc.dg/declare-simd.c (fn2): Expect a new warning on aarch64. (cherry picked from commit 06589d2232abc92ac9bcb43e4a4ec64ead627752)
[Bug middle-end/93235] [AArch64] ICE with __fp16 in a struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93235 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:5d47377411652b67e258d81fb9e29a603556fd16 commit r9-9430-g5d47377411652b67e258d81fb9e29a603556fd16 Author: Jakub Jelinek Date: Thu Mar 4 19:38:08 2021 +0100 expand: Fix ICE in store_bit_field_using_insv [PR93235] The following testcase ICEs on aarch64. The problem is that op0 is (subreg:HI (reg:HF ...) 0) and because we can't create a SUBREG of a SUBREG and aarch64 doesn't have HImode insv, only SImode insv, store_bit_field_using_insv tries to create (subreg:SI (reg:HF ...) 0) which is not valid for the target and so gen_rtx_SUBREG ICEs. The following patch fixes it by punting if the to be created SUBREG doesn't validate, callers of store_bit_field_using_insv can handle the fallback. 2021-03-04 Jakub Jelinek PR middle-end/93235 * expmed.c (store_bit_field_using_insv): Return false if xop0 is a SUBREG and a SUBREG to op_mode can't be created. * gcc.target/aarch64/pr93235.c: New test. (cherry picked from commit 510ff5def87c70836fdbf832228661ae28e524b6)
[Bug c++/82959] g++ doesn't appreciate C++17 evaluation order rules for overloaded operators
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82959 --- Comment #8 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6c085d6d783f38f008ea54f80b43f6b8e8f6b971 commit r9-9429-g6c085d6d783f38f008ea54f80b43f6b8e8f6b971 Author: Jakub Jelinek Date: Wed Mar 3 16:12:23 2021 +0100 c++: Fix -fstrong-eval-order for operator &&, || and , [PR82959] P0145R3 added "However, the operands are sequenced in the order prescribed for the built-in operator" rule for overloaded operator calls when using the operator syntax. op_is_ordered follows that, but added just the overloaded operators added in that paper. &&, || and comma operators had rules that lhs is sequenced before rhs already in C++98. The following patch adds those cases to op_is_ordered. 2021-03-03 Jakub Jelinek PR c++/82959 * call.c (op_is_ordered): Handle TRUTH_ANDIF_EXPR, TRUTH_ORIF_EXPR and COMPOUND_EXPR. * g++.dg/cpp1z/eval-order10.C: New test. (cherry picked from commit 529e3b3402bd2a97b02318bd834df72815be5f0f)
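The rule quoted from P0145R3 can be observed with a small tracing program. This is only a sketch: `Tracked`, `make`, `trace` and `order_of_and` are hypothetical names invented here and are not the contents of g++.dg/cpp1z/eval-order10.C.

```cpp
#include <string>

// Records the order in which operands are evaluated.
std::string trace;

struct Tracked { };

// Each operand appends its tag to the trace when it is evaluated.
Tracked make(char tag)
{
    trace += tag;
    return Tracked{};
}

// Overloaded versions of the three operators named in the commit.
bool operator&&(Tracked, Tracked) { return true; }
bool operator||(Tracked, Tracked) { return true; }
Tracked operator,(Tracked, Tracked) { return Tracked{}; }

// Evaluates an overloaded && written with operator syntax and returns the trace.
std::string order_of_and()
{
    trace.clear();
    (void)(make('L') && make('R'));
    return trace;
}
```

With C++17 semantics and -fstrong-eval-order in effect, a compiler containing this fix is expected to return "LR" from `order_of_and()`; without the ordering guarantee either "LR" or "RL" is possible. Note that the overloaded && still evaluates both operands, since only the sequencing follows the built-in operator and not the short-circuiting.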
[Bug c/99324] [8/9 Regression] ICE in mark_addressable, at gimple-expr.c:918 since r6-314
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99324 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:12cd8e1b690a22c478f91671dd7cf5f6cf8332a7 commit r9-9428-g12cd8e1b690a22c478f91671dd7cf5f6cf8332a7 Author: Jakub Jelinek Date: Wed Mar 3 09:55:19 2021 +0100 c-family: Avoid ICE on va_arg [PR99324] build_va_arg calls the middle-end mark_addressable, which e.g. requires that cfun is non-NULL. The following patch calls instead c_common_mark_addressable_vec which is the c-family variant similarly to the FE c_mark_addressable and cxx_mark_addressable, except that it doesn't error on addresses of register variables. As the taking of the address is artificial for the .VA_ARG ifn and when that is lowered goes away, it is similar case to the vector subscripting for which c_common_mark_addressable_vec has been added. 2021-03-03 Jakub Jelinek PR c/99324 * c-common.c (build_va_arg): Call c_common_mark_addressable_vec instead of mark_addressable. Fix a comment typo - neutrallly -> neutrally. * gcc.c-torture/compile/pr99324.c: New test. (cherry picked from commit 0e87dc86eb56f732a41af2590f0b807031003fbe)
[Bug c++/95451] [8/9 regression] ICE for lambda capturing this and calling operator() since r8-2720-gf44a8dd56f5bfbd0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95451 --- Comment #13 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:1f7f4e1118ab7f1118a2977303b672ecec439a6b commit r9-9427-g1f7f4e1118ab7f1118a2977303b672ecec439a6b Author: Jakub Jelinek Date: Fri Feb 26 10:43:28 2021 +0100 c++: Fix operator() lookup in lambdas [PR95451] During name lookup, name-lookup.c uses: if (!(!iter->type && HIDDEN_TYPE_BINDING_P (iter)) && (bool (want & LOOK_want::HIDDEN_LAMBDA) || !is_lambda_ignored_entity (iter->value)) && qualify_lookup (iter->value, want)) binding = iter->value; Unfortunately as the following testcase shows, this doesn't work in generic lambdas, where we on the auto b = ... lambda ICE and on the auto d = lambda reject it even when it should be valid. The problem is that the binding doesn't have a FUNCTION_DECL with LAMBDA_FUNCTION_P for the operator(), but an OVERLOAD with TEMPLATE_DECL for such FUNCTION_DECL. The following patch fixes that in is_lambda_ignored_entity, other possibility would be to do that before calling is_lambda_ignored_entity in name-lookup.c. 2021-02-26 Jakub Jelinek PR c++/95451 * lambda.c (is_lambda_ignored_entity): Before checking for LAMBDA_FUNCTION_P, use OVL_FIRST. Drop FUNCTION_DECL check. * g++.dg/cpp1y/lambda-generic-95451.C: New test. (cherry picked from commit 8f9308936cf1df134d5aac1f890eb67266530ab5)
[Bug tree-optimization/99225] [8/9 Regression] ICE in wide_int_to_tree_1, at tree.c:1644
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99225 --- Comment #7 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:7c9f7293c995b662457b4e7aba97a6faa4d86dc5 commit r9-9426-g7c9f7293c995b662457b4e7aba97a6faa4d86dc5 Author: Jakub Jelinek Date: Wed Feb 24 12:10:25 2021 +0100 fold-const: Fix up ((1 << x) & y) != 0 folding for vectors [PR99225] This optimization was written purely with scalar integers in mind, can work fine even with vectors, but we can't use build_int_cst but need to use build_one_cst instead. 2021-02-24 Jakub Jelinek PR tree-optimization/99225 * fold-const.c (fold_binary_loc) : In (x & (1 << y)) != 0 to ((x >> y) & 1) != 0 simplifications use build_one_cst instead of build_int_cst (..., 1). Formatting fixes. * gcc.c-torture/compile/pr99225.c: New test. (cherry picked from commit 4de402ab60c54fff48cb7371644b024d10d7e5bb)
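For the scalar case the two forms named in the ChangeLog entry are interchangeable, which can be checked exhaustively over the bit positions. The sketch below uses hypothetical function names and fixed 32-bit unsigned operands; it does not model the vector case, where the constant 1 has to be a vector of ones, which is why the patch switches to build_one_cst.

```cpp
#include <cstdint>

// ((1 << x) & y) != 0, the form before the simplification.
bool bit_set_shift_mask(std::uint32_t y, unsigned x)
{
    return ((std::uint32_t{1} << x) & y) != 0;
}

// ((y >> x) & 1) != 0, the form after the simplification.
bool bit_set_shift_value(std::uint32_t y, unsigned x)
{
    return ((y >> x) & std::uint32_t{1}) != 0;
}

// Check that both forms agree for every valid bit position of y.
bool forms_agree(std::uint32_t y)
{
    for (unsigned x = 0; x < 32; ++x)
        if (bit_set_shift_mask(y, x) != bit_set_shift_value(y, x))
            return false;
    return true;
}
```

Calling `forms_agree` on any 32-bit value should return true, since both forms test the same single bit of `y`.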
[Bug tree-optimization/99204] [8/9 Regression] ICE in fold_read_from_constant_string, at fold-const.c:15441
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99204 --- Comment #11 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:aec805be1ba98f9d79b1f7be3236deacc2e63551 commit r9-9425-gaec805be1ba98f9d79b1f7be3236deacc2e63551 Author: Jakub Jelinek Date: Tue Feb 23 09:49:48 2021 +0100 fold-const: Fix ICE in fold_read_from_constant_string on invalid code [PR99204] fold_read_from_constant_string and expand_expr_real_1 have code to optimize constant reads from string (tree vs. rtl). If the STRING_CST array type has zero low bound, index is fold converted to sizetype and so the compare_tree_int works fine, but if it has some other low bound, it calls size_diffop_loc and that function from 2 sizetype operands creates a ssizetype difference. expand_expr_real_1 then uses tree_fits_uhwi_p + compare_tree_int and so works fine, but fold-const.c only checked if index is INTEGER_CST and calls compare_tree_int, which means for negative index it will succeed and result in UB in the compiler. 2021-02-23 Jakub Jelinek PR tree-optimization/99204 * fold-const.c (fold_read_from_constant_string): Check that tree_fits_uhwi_p (index) rather than just that index is INTEGER_CST. * gfortran.dg/pr99204.f90: New test. (cherry picked from commit f53a9b563b5017af179f1fd900189c0ba83aa2ec)
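The difference between the two checks can be sketched outside the compiler with plain integers. Both functions below are hypothetical stand-ins: they are an analogy for comparing a possibly negative index against a string length, not the actual compare_tree_int or tree_fits_uhwi_p implementations.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for a check that only tests the upper bound:
// a negative index passes it.
bool upper_bound_only(std::int64_t index, std::size_t length)
{
    return index < static_cast<std::int64_t>(length);
}

// Stand-in for the fixed check: the index must first fit an
// unsigned value, and only then is it compared against the length.
bool fits_unsigned_and_in_bounds(std::int64_t index, std::size_t length)
{
    return index >= 0 && static_cast<std::uint64_t>(index) < length;
}
```

For an index of -1 and a length of 4, the first function accepts the index and the second rejects it, which mirrors why the negative ssizetype difference described above slipped through the original fold-const.c check.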
[Bug libstdc++/99181] char_traits (and thus string_view) compares strings differently in constexpr and non-constexpr contexts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99181 --- Comment #8 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:03e18a40070c6fe89397a73d85a38a99371cf8a1 commit r9-9424-g03e18a40070c6fe89397a73d85a38a99371cf8a1 Author: Jakub Jelinek Date: Tue Feb 23 09:30:18 2021 +0100 libstdc++: Fix up constexpr std::char_traits::compare [PR99181] Because of LWG 467, std::char_traits::lt compares the values cast to unsigned char rather than char, so even when char is signed we get unsigned comparison. std::char_traits::compare uses __builtin_memcmp and that works the same, but during constexpr evaluation we were calling __gnu_cxx::char_traits::compare. As char_traits::lt is not virtual, __gnu_cxx::char_traits::compare used __gnu_cxx::char_traits::lt rather than std::char_traits::lt and thus compared chars as signed if char is signed. This change fixes it by inlining __gnu_cxx::char_traits::compare into std::char_traits::compare by hand, so that it calls the right lt method. 2021-02-23 Jakub Jelinek PR libstdc++/99181 * include/bits/char_traits.h (char_traits::compare): For constexpr evaluation don't call __gnu_cxx::char_traits::compare but do the comparison loop directly. * testsuite/21_strings/char_traits/requirements/char/99181.cc: New test. (cherry picked from commit 311c57f6d8f285d69e44bf94152c753900cb1a0a)
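The unsigned ordering described above can be observed at run time with a byte whose top bit is set. This is a minimal sketch with hypothetical function names; it exercises only the run-time path and is not the 99181.cc test from the commit.

```cpp
#include <string>

// True when std::char_traits<char> orders 'a' (0x61) before the byte 0x80,
// i.e. when characters are compared as unsigned char.
bool lt_is_unsigned()
{
    return std::char_traits<char>::lt('a', static_cast<char>(0x80));
}

// Run-time comparison of the same two one-character sequences;
// a negative result means "a" sorts before "\x80".
int compare_runtime()
{
    const char low[] = "a";
    const char high[] = "\x80";
    return std::char_traits<char>::compare(low, high, 1);
}
```

On a target where char is signed, a signed comparison would order 0x80 (as -128) before 'a', so `compare_runtime()` returning a negative value shows the unsigned behaviour. After the fix, the same call evaluated in a constant expression (C++17 and later) is expected to give a result of the same sign.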
[Bug ipa/99034] [9 Regression] error: EH landing pad label is not first in a sequence of labels in bb 6 during GIMPLE pass: einline since r9-6254-gf86624d85f937e03
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99034 --- Comment #11 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:8a6114146001eafc1921b60e40bf5c0e4f4b8e64 commit r9-9423-g8a6114146001eafc1921b60e40bf5c0e4f4b8e64 Author: Jakub Jelinek Date: Fri Feb 19 12:14:39 2021 +0100 tree-cfg: Fix up gimple_merge_blocks FORCED_LABEL handling [PR99034] The verifiers require that DECL_NONLOCAL or EH_LANDING_PAD_NR labels are always the first label if there is more than one label. When merging blocks, we don't honor that though. On the following testcase, we try to merge blocks: [count: 0]: : S::~S (&s); and [count: 0]: : resx 1 where is landing pad and is FORCED_LABEL. And the code puts the FORCED_LABEL before the landing pad label, violating the verification requirements. The following patch fixes it by moving the FORCED_LABEL after the DECL_NONLOCAL or EH_LANDING_PAD_NR label if it is the first label. 2021-02-19 Jakub Jelinek PR ipa/99034 * tree-cfg.c (gimple_merge_blocks): If bb a starts with eh landing pad or non-local label, put FORCED_LABELs from bb b after that label rather than before it. * g++.dg/opt/pr99034.C: New test. (cherry picked from commit 33be24d77d3d8f0c992eb344ce63f78e14cf753d)
[Bug c/99136] ICE in gimplify_expr, at gimplify.c:14854
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99136 --- Comment #5 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:884f790b8414e377665706f51254e8bd49a72f0b commit r9-9422-g884f790b8414e377665706f51254e8bd49a72f0b Author: Jakub Jelinek Date: Thu Feb 18 22:17:52 2021 +0100 c: Fix ICE with -fexcess-precision=standard [PR99136] The following testcase ICEs on i686-linux, because c_finish_return wraps c_fully_folded retval back into EXCESS_PRECISION_EXPR, but when the function return type is void, we don't call convert_for_assignment on it that would then be fully folded again, but just put the retval into RETURN_EXPR's operand, so nothing removes it anymore and during gimplification we ICE as EXCESS_PRECISION_EXPR is not handled. This patch fixes it by not adding that EXCESS_PRECISION_EXPR in functions returning void, the return value is ignored and all we need is evaluate any side-effects of the expression. 2021-02-18 Jakub Jelinek PR c/99136 * c-typeck.c (c_finish_return): Don't wrap retval into EXCESS_PRECISION_EXPR in functions that return void. * gcc.dg/pr99136.c: New test. (cherry picked from commit 3d7ce7ce6c03165ca1041b38e02428c925254968)
[Bug sanitizer/99106] [9 Regression] ICE in tree_to_poly_int64, at tree.c:3091
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99106 --- Comment #6 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:89da2c8127c373573e5e486efe7699da794d469b commit r9-9421-g89da2c8127c373573e5e486efe7699da794d469b Author: Jakub Jelinek Date: Wed Feb 17 15:03:25 2021 +0100 c++: Fix up build_zero_init_1 once more [PR99106] My earlier build_zero_init_1 patch for flexible array members created an empty CONSTRUCTOR. As the following testcase shows, that doesn't work very well because the middle-end doesn't expect CONSTRUCTOR elements with incomplete type (that the empty CONSTRUCTOR at the end of outer CONSTRUCTOR had). The following patch just doesn't add any CONSTRUCTOR for the flexible array members, it doesn't seem to be needed. 2021-02-17 Jakub Jelinek PR sanitizer/99106 * init.c (build_zero_init_1): For flexible array members just return NULL_TREE instead of returning empty CONSTRUCTOR with non-complete ARRAY_TYPE. * g++.dg/ubsan/pr99106.C: New test. (cherry picked from commit af868e89ec21340d1cafd26eaed356ce4b0104c3)
[Bug tree-optimization/99079] [8/9 Regression] Maybe a wrong code since r6-1462-g4ab1e111ef0669bb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99079 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:dfcddd8ed5c584f6eca6d918f8d88da2567d7350 commit r9-9420-gdfcddd8ed5c584f6eca6d918f8d88da2567d7350 Author: Jakub Jelinek Date: Mon Feb 15 09:16:06 2021 +0100 match.pd: Fix up A % (cast) (pow2cst << B) simplification [PR99079] The (mod @0 (convert?@3 (power_of_two_cand@1 @2))) simplification uses tree_nop_conversion_p (type, TREE_TYPE (@3)) condition, but I believe it doesn't check what it was meant to check. On convert?@3 TREE_TYPE (@3) is not the type of what it has been converted from, but what it has been converted to, which needs to be (because it is operand of normal binary operation) equal or compatible to type of the modulo result and first operand - type. I could fix that by using && tree_nop_conversion_p (type, TREE_TYPE (@1)) and be done with it, but actually most of the non-nop conversions are IMHO ok and so we would regress those optimizations. In particular, if we have say narrowing conversions (foo5 and foo6 in the new testcase), I think we are fine, either the shift of the power of two constant after narrowing conversion is still that power of two (or negation of that) and then it will still work, or the result of narrowing conversion is 0 and then we would have UB which we can ignore. Similarly, widening conversions where the shift result is unsigned are fine, or even widening conversions where the shift result is signed, but we sign extend to a signed wider divisor, the problematic case of INT_MIN will become x % (long long) INT_MIN and we can still optimize that to x & (long long) INT_MAX. 
What doesn't work is the case in the pr99079.c testcase, widening conversion of a signed shift result to wider unsigned divisor, where if the shift is negative, we end up with x % (unsigned long long) INT_MIN which is x % 0xffffffff80000000ULL where the divisor is not a power of two and we can't optimize that to x & 0x7fffffffULL. So, the patch rejects only the single problematic case. Furthermore, when the shift result is signed, we were introducing UB into a program which previously didn't have one (well, left shift into the sign bit is UB in some language/version pairs, but it is definitely valid in C++20 - wonder if I shouldn't move the gcc.c-torture/execute/pr99079.c testcase to g++.dg/torture/pr99079.C and use -std=c++20), by adding that subtraction of 1, x % (1 << 31) in C++20 is well defined, but x & ((1 << 31) - 1) triggers UB on the subtraction. So, the patch performs the subtraction in the unsigned type if it isn't wrapping. 2021-02-15 Jakub Jelinek PR tree-optimization/99079 * match.pd (A % (pow2pcst << N) -> A & ((pow2pcst << N) - 1)): Remove useless tree_nop_conversion_p (type, TREE_TYPE (@3)) check. Instead require both type and TREE_TYPE (@1) to be integral types and either type having smaller or equal precision, or TREE_TYPE (@1) being unsigned type, or type being signed type. If TREE_TYPE (@1) doesn't have wrapping overflow, perform the subtraction of one in unsigned type. * gcc.dg/fold-modpow2-2.c: New test. * gcc.c-torture/execute/pr99079.c: New test. (cherry picked from commit 45de8afb2d534e3b38b4d1898686b20c29cc6a94)
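The safe and the rejected cases can be contrasted with a few lines of arithmetic. This sketch assumes a 32-bit int and uses hypothetical helper names; it only illustrates the reasoning and is not the pr99079.c testcase.

```cpp
#include <climits>
#include <cstdint>

// True when v has exactly one bit set.
bool is_power_of_two(std::uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

// The safe case: an unsigned power-of-two divisor, where
// x % divisor and x & (divisor - 1) agree. Requires n < 32.
bool mod_matches_mask(std::uint32_t x, unsigned n)
{
    const std::uint32_t divisor = std::uint32_t{1} << n;
    return x % divisor == (x & (divisor - 1));
}

// The problematic case: INT_MIN converted to a wider unsigned type.
// The conversion sign-extends, so with a 32-bit int this is
// 0xffffffff80000000, not 0x80000000.
std::uint64_t widened_int_min()
{
    return static_cast<std::uint64_t>(INT_MIN);
}
```

Under the 32-bit int assumption, `is_power_of_two(widened_int_min())` is false, so the modulo cannot be replaced by a mask in that one case, while `mod_matches_mask` holds for every unsigned divisor of the form 1 << n.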
[Bug c++/99033] [9 Regression] ICE in tree_to_poly_int64, at tree.c:3091
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99033 --- Comment #8 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:d1eaf74ee3ac503576e2830e77505cce1ee56e8d commit r9-9419-gd1eaf74ee3ac503576e2830e77505cce1ee56e8d Author: Jakub Jelinek Date: Thu Feb 11 17:24:17 2021 +0100 c++: Fix zero initialization of flexible array members [PR99033] array_type_nelts returns error_mark_node for type of flexible array members and build_zero_init_1 was placing an error_mark_node into the CONSTRUCTOR, on which e.g. varasm ICEs. I think there is nothing erroneous on zero initialization of flexible array members though, such arrays should simply get no elements, like they do if such classes are constructed (everything except when some larger initializer comes from an explicit initializer). So, this patch handles [] arrays in zero initialization like [0] arrays and fixes handling of the [0] arrays - the tree_int_cst_equal (max_index, integer_minus_one_node) check didn't do what it thought it would do, max_index is typically unsigned integer (sizetype) and so it is never equal to a -1. What the patch doesn't do and maybe would be desirable is if it returns error_mark_node for other reasons let the recursive callers not stick that into CONSTRUCTOR but return error_mark_node instead. But I don't have a testcase where that would be needed right now. 2021-02-11 Jakub Jelinek PR c++/99033 * init.c (build_zero_init_1): Handle zero initialization of flexible array members like initialization of [0] arrays. Use integer_minus_onep instead of comparison to integer_minus_one_node and integer_zerop instead of comparison against size_zero_node. Formatting fixes. * g++.dg/ext/flexary38.C: New test. (cherry picked from commit ea535f59b19f65e5b313c990ee6c194a7b055bd7)
[Bug c++/99035] [9 Regression] ICE in declare_weak, at varasm.c:5930
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99035 --- Comment #7 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6496e9154309cbd911f944147da1246628e393da commit r9-9418-g6496e9154309cbd911f944147da1246628e393da Author: Jakub Jelinek Date: Wed Feb 10 19:52:37 2021 +0100 varasm: Fix ICE with -fsyntax-only [PR99035] My FE change from 2 years ago uses TREE_ASM_WRITTEN in -fsyntax-only mode more aggressively to avoid "expanding" functions multiple times. With -fsyntax-only nothing is really expanded, so I think it is acceptable to adjust the assert and allow declare_weak at any time, with -fsyntax-only we know it is during parsing only anyway. 2021-02-10 Jakub Jelinek PR c++/99035 * varasm.c (declare_weak): For -fsyntax-only, allow even TREE_ASM_WRITTEN function decls. * g++.dg/ext/weak6.C: New test. (cherry picked from commit a964f494cd5a90f631b8c0c01777a9899e0351ce)
[Bug middle-end/99007] [8/9 Regression] ICE in dominated_by_p, at dominance.c:1124
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99007 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:024f7908cbdd82e4e23a06e6df47f3823e8bdf59 commit r9-9417-g024f7908cbdd82e4e23a06e6df47f3823e8bdf59 Author: Jakub Jelinek Date: Wed Feb 10 10:34:58 2021 +0100 openmp: Temporarily disable into_ssa when gimplifying OpenMP reduction clauses [PR99007] gimplify_scan_omp_clauses was already calling gimplify_expr with false as last argument to make sure it is not an SSA_NAME, but as the testcases show, that is not enough, SSA_NAME temporaries created during that gimplification can be reused too and we can't allow SSA_NAMEs to be used across OpenMP region boundaries, as we can only firstprivatize decls. Fixed by temporarily disabling into_ssa. 2021-02-10 Jakub Jelinek PR middle-end/99007 * gimplify.c (gimplify_scan_omp_clauses): For MEM_REF on reductions, temporarily disable gimplify_ctxp->into_ssa around gimplify_expr calls. * g++.dg/gomp/pr99007.C: New test. * gcc.dg/gomp/pr99007-1.c: New test. * gcc.dg/gomp/pr99007-2.c: New test. * gcc.dg/gomp/pr99007-3.c: New test. (cherry picked from commit deba6b20a3889aa23f0e4b3a5248de4172a0167d)
[Bug c++/97878] [8/9 Regression] ICE in cxx_eval_outermost_constant_expr, at cp/constexpr.c:6825
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97878 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:5c85df8968b00acc934396d7461a4a5ac6ddedd1 commit r9-9416-g5c85df8968b00acc934396d7461a4a5ac6ddedd1 Author: Jakub Jelinek Date: Fri Feb 5 10:22:07 2021 +0100 c++: Fix ICE with structured binding initialized to incomplete array [PR97878] We ICE on the following testcase, for incomplete array a on auto [b] { a }; without giving any kind of diagnostics, with auto [c] = a; during error-recovery. The problem is that we get too far through check_initializer and e.g. store_init_value -> constexpr stuff can't deal with incomplete array types. As the type of the structured binding artificial variable is always deduced, I think it is easiest to diagnose this early, even if they have array types we'll need their deduced type to be complete rather than just its element type. 2021-02-05 Jakub Jelinek PR c++/97878 * decl.c (check_array_initializer): For structured bindings, require the array type to be complete. * g++.dg/cpp1z/decomp54.C: New test. (cherry picked from commit 8b7f2d3eae16dd629ae7ae40bb76f4bb0099f441)
[Bug middle-end/97487] [8/9 Regression] ICE in expand_simple_binop, at optabs.c:939 since r8-3977
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97487 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:e55dc66ddefebef79f8d733ba6eb835c7b52d7ec commit r9-9415-ge55dc66ddefebef79f8d733ba6eb835c7b52d7ec Author: Jakub Jelinek Date: Wed Feb 3 09:09:26 2021 +0100 ifcvt: Avoid ICEs trying to force_operand random RTL [PR97487] As the testcase shows, RTL ifcvt can throw random RTL (whatever it found in some insns) at expand_binop or expand_unop and expects it to do something (and then will check if it created valid insns and punts if not). These functions in the end if the operands don't match try to copy_to_mode_reg the operands, which does if (!general_operand (x, VOIDmode)) x = force_operand (x, temp); but, force_operand is far from handling all possible RTLs, it will ICE for all more unusual RTL codes. Basically handles just simple arithmetic and unary RTL operations if they have an optab and expand_simple_binop/expand_simple_unop ICE on others. The following patch fixes it by adding some operand verification (whether there is a hope that copy_to_mode_reg will succeed on those). It is added both to noce_emit_move_insn (not needed for this exact testcase, that function simply tries to recog the insn as is and if it fails, handles some simple binop/unop cases; the patch performs the verification of their operands) and noce_try_sign_mask. 2021-02-03 Jakub Jelinek PR middle-end/97487 * ifcvt.c (noce_can_force_operand): New function. (noce_emit_move_insn): Use it. (noce_try_sign_mask): Likewise. Formatting fix. * gcc.dg/pr97487-1.c: New test. * gcc.dg/pr97487-2.c: New test. (cherry picked from commit 025a0ee3911c0866c69f841df24a558c7c8df0eb)
[Bug middle-end/97971] [9 Regression] ICE in process_alt_operands, at lra-constraints.c:3110
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97971 --- Comment #7 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:2f9a241a308c32108b922bd768b7576c5c34e440 commit r9-9414-g2f9a241a308c32108b922bd768b7576c5c34e440 Author: Jakub Jelinek Date: Wed Feb 3 09:07:36 2021 +0100 lra-constraints: Fix error-recovery for bad inline-asms [PR97971] The following testcase has ice-on-invalid, it can't be reloaded, but we shouldn't ICE the compiler because the user typed non-sense. In current_insn_transform we have: if (process_alt_operands (reused_alternative_num)) alt_p = true; if (check_only_p) return ! alt_p || best_losers != 0; /* If insn is commutative (it's safe to exchange a certain pair of operands) then we need to try each alternative twice, the second time matching those two operands as if we had exchanged them. To do this, really exchange them in operands. If we have just tried the alternatives the second time, return operands to normal and drop through. */ if (reused_alternative_num < 0 && commutative >= 0) { curr_swapped = !curr_swapped; if (curr_swapped) { swap_operands (commutative); goto try_swapped; } else swap_operands (commutative); } if (! alt_p && ! sec_mem_p) { /* No alternative works with reloads?? */ if (INSN_CODE (curr_insn) >= 0) fatal_insn ("unable to generate reloads for:", curr_insn); error_for_asm (curr_insn, "inconsistent operand constraints in an %"); lra_asm_error_p = true; ... and so handle inline asms there differently (and delete/nullify them after this) - fatal_insn is only called for non-inline asm. But in process_alt_operands we do: /* Both the earlyclobber operand and conflicting operand cannot both be user defined hard registers. 
*/ if (HARD_REGISTER_P (operand_reg[i]) && REG_USERVAR_P (operand_reg[i]) && operand_reg[j] != NULL_RTX && HARD_REGISTER_P (operand_reg[j]) && REG_USERVAR_P (operand_reg[j])) fatal_insn ("unable to generate reloads for " "impossible constraints:", curr_insn); and thus ICE even for inline-asms. I think it is inappropriate to delete/nullify the insn in process_alt_operands, as it could be done e.g. in the check_only_p mode, so this patch just returns false in that case, which results in the caller having alt_p false, and as inline asm isn't simple move, sec_mem_p will be also false (and it isn't commutative either), so for check_only_p it will suggest to the callers it isn't ok and otherwise will emit error and delete/nullify the inline asm insn. 2021-02-03 Jakub Jelinek PR middle-end/97971 * lra-constraints.c (process_alt_operands): For inline asm, don't call fatal_insn, but instead return false. * gcc.target/i386/pr97971.c: New test. (cherry picked from commit 4dd7141653b57f638fc32291245d57d4dcfa3813)
[Bug debug/98331] [8/9 Regression] ICE in haifa_luid_for_non_insn, at haifa-sched.c:7845 since r8-5479-g67a8d7199fe4e474
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98331 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:b717d2be2c1ec177488d2af8c63f441810fd0e29 commit r9-9413-gb717d2be2c1ec177488d2af8c63f441810fd0e29 Author: Jakub Jelinek Date: Fri Jan 29 10:30:09 2021 +0100 expand: Fix up find_bb_boundaries [PR98331] When expansion emits some control flow insns etc. inside of a former GIMPLE basic block, find_bb_boundaries needs to split it into multiple basic blocks. The code needs to ignore debug insns in decisions how many splits to do or where in between some non-debug insns the split should be done, but it can decide where to put debug insns if they can be kept and otherwise throws them away (they can't stay outside of basic blocks). On the following testcase, we end up in the bb from expander with control flow insn debug insns barrier some other insn (the some other insn is effectively dead after __builtin_unreachable and we'll optimize that out later). Without debug insns, we'd do the split when encountering some other insn and split after PREV_INSN (some other insn), i.e. after barrier (and the splitting code then moves the barrier in between basic blocks). But if there are debug insns, we actually split before the first debug insn that appeared after the control flow insn, so after control flow insn, and get a basic block that starts with debug insns and then has a barrier in the middle that nothing moves it out of the bb. This leads to ICEs and even if it wouldn't, different behavior from -g0. The reason for treating debug insns that way is a different case, e.g. control flow insn debug insns some other insn or even control flow insn barrier debug insns some other insn where splitting before the first such debug insn allows us to keep them while otherwise we would have to drop them on the floor, and in those situations we behave the same with -g0 and -g. 
So, the following patch fixes it by resetting debug_insn not just when splitting the blocks (it is set only after seeing a control flow insn and before splitting for it if needed), but also when seeing a barrier, which effectively means we always throw away debug insns after a control flow insn and before following barrier if any, but there is no way around that, control flow insn must be the last in the bb (BB_END) and BARRIER after it, debug insns aren't allowed outside of bb. We still handle the other cases fine (when there is no barrier or when debug insns appear only after the barrier). 2021-01-29 Jakub Jelinek PR debug/98331 * cfgbuild.c (find_bb_boundaries): Reset debug_insn when seeing a BARRIER. * gcc.dg/pr98331.c: New test. (cherry picked from commit ea0e1eaa30f42e108f6c716745347cc1dcfdc475)
[Bug c++/33661] template methods forget explicit local register asm vars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33661 --- Comment #21 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:ef5db37cc4e80b229502bea7d6e2daa95ad6f805 commit r9-9412-gef5db37cc4e80b229502bea7d6e2daa95ad6f805 Author: Jakub Jelinek Date: Thu Jan 28 16:13:11 2021 +0100 c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847] As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. 2021-01-28 Jakub Jelinek PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.target/i386/pr98847.C: New test. (cherry picked from commit cf93f94b3498f3925895fb0bbfd4b64232b9987a)
[Bug inline-asm/98847] Miscompilation with c++17, templates, and register keyword
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98847 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:ef5db37cc4e80b229502bea7d6e2daa95ad6f805 commit r9-9412-gef5db37cc4e80b229502bea7d6e2daa95ad6f805 Author: Jakub Jelinek Date: Thu Jan 28 16:13:11 2021 +0100 c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847] As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. 2021-01-28 Jakub Jelinek PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.target/i386/pr98847.C: New test. (cherry picked from commit cf93f94b3498f3925895fb0bbfd4b64232b9987a)
[Bug sanitizer/95693] [8/9 Regression] Incorrect error from undefined behavior sanitizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95693 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:4ccdb3fdbc14102c91b6148bcbe09d0763726ae0 commit r9-9408-g4ccdb3fdbc14102c91b6148bcbe09d0763726ae0 Author: Jakub Jelinek Date: Fri Jan 22 19:03:23 2021 +0100 c++: Fix up ubsan false positives on references [PR95693] Alex' 2 years old change to build_zero_init_1 to return NULL pointer with reference type for references breaks the sanitizers, the assignment of NULL to a reference typed member is then instrumented before it is overwritten with a non-NULL address later on. That change has been done to fix error recovery ICE during process_init_constructor_record, where we: if (TYPE_REF_P (fldtype)) { if (complain & tf_error) error ("member %qD is uninitialized reference", field); else return PICFLAG_ERRONEOUS; } a few lines earlier, but then continue and ICE when build_zero_init returns NULL. The following patch reverts the build_zero_init_1 change and instead creates the NULL with reference type constants during the error recovery. The pr84593.C testcase Alex' change was fixing still works as before. 2021-01-22 Jakub Jelinek PR sanitizer/95693 * init.c (build_zero_init_1): Revert the 2018-03-06 change to return build_zero_cst for reference types. * typeck2.c (process_init_constructor_record): Instead call build_zero_cst here during error recovery instead of build_zero_init. * g++.dg/ubsan/pr95693.C: New test. (cherry picked from commit e5750f847158e7f9bdab770fd9c5fff58c5074d3)
[Bug target/98853] [9 Regression] wrong use of bfxil at -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98853 --- Comment #8 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:bde3846fb90f51b75685c9b6d677015daaec5f69 commit r9-9411-gbde3846fb90f51b75685c9b6d677015daaec5f69 Author: Jakub Jelinek Date: Wed Jan 27 20:35:21 2021 +0100 aarch64: Fix up *aarch64_bfxilsi_uxtw [PR98853] The https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg01895.html patch that introduced this pattern claimed: "Would generate: combine_balanced_int: bfxil w0, w1, 0, 16; uxtw x0, w0; ret. But with this patch generates: combine_balanced_int: bfxil w0, w1, 0, 16; ret", and that is indeed what it should generate, but it doesn't do that; it emits bfxil x0, x1, 0, 16 instead, which doesn't zero extend from 32 to 64 bits, but preserves the bits from the destination register. 2021-01-27 Jakub Jelinek PR target/98853 * config/aarch64/aarch64.md (*aarch64_bfxilsi_uxtw): Use %w0, %w1 and %2 instead of %0, %1 and %2. * gcc.c-torture/execute/pr98853-1.c: New test. * gcc.c-torture/execute/pr98853-2.c: New test. (cherry picked from commit 2a2c1e22c2501457608f12d5ab560caaca59c425)
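The difference between the 32-bit and 64-bit forms of bfxil described above can be modelled in a few lines. This is only a sketch of the instruction semantics as the commit message states them (the W form zero-extends into the X register, the X form keeps the destination's upper bits); the helper names low_mask, bfxil_w and bfxil_x are made up for illustration and are not GCC or Arm identifiers.

```cpp
#include <cstdint>

// Mask with the low `width` bits set (assumes 1 <= width <= 64).
inline std::uint64_t low_mask(unsigned width) {
    return width >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << width) - 1;
}

// Model of "bfxil wd, wn, lsb, width": copy `width` bits of the 32-bit source,
// starting at bit `lsb`, into the low bits of the 32-bit destination.
// Writing a W register clears bits 32..63 of the corresponding X register.
// Assumes lsb < 32 and width <= 32 - lsb.
inline std::uint64_t bfxil_w(std::uint64_t dst, std::uint64_t src,
                             unsigned lsb, unsigned width) {
    const std::uint32_t d = static_cast<std::uint32_t>(dst);
    const std::uint32_t s = static_cast<std::uint32_t>(src);
    const std::uint32_t m = static_cast<std::uint32_t>(low_mask(width));
    const std::uint32_t field = (s >> lsb) & m;
    return static_cast<std::uint32_t>((d & ~m) | field);  // zero-extended result
}

// Model of "bfxil xd, xn, lsb, width": same insertion on 64-bit registers,
// so the upper bits of the destination survive.
// Assumes lsb < 64 and width <= 64 - lsb.
inline std::uint64_t bfxil_x(std::uint64_t dst, std::uint64_t src,
                             unsigned lsb, unsigned width) {
    const std::uint64_t m = low_mask(width);
    return (dst & ~m) | ((src >> lsb) & m);
}
```

With a destination whose upper half is non-zero, bfxil_w yields a value with the upper 32 bits cleared, while bfxil_x leaves them in place, which is the stale data the bug report observed.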
[Bug target/98681] [8/9 Regression] aarch64: Invalid ubfiz instruction rejected by assembler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98681 --- Comment #13 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:e3dc765eb4556b78fe52d32be9858f2805b4488d commit r9-9410-ge3dc765eb4556b78fe52d32be9858f2805b4488d Author: Jakub Jelinek Date: Tue Jan 26 14:48:26 2021 +0100 aarch64: Tighten up checks for ubfix [PR98681] The testcase in the patch doesn't assemble, because the instruction requires that the penultimate operand (lsb) range is [0, 32] (or [0, 64]) and the last operand's range is [1, 32 - lsb] (or [1, 64 - lsb]). The INTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) will accept the lsb operand to be in range [MIN, 32] (or [MIN, 64]) and then we invoke UB in the compiler and sometimes it will make it through. The patch changes all the INTVAL uses in that function to UINTVAL, which isn't strictly necessary, but can be done (e.g. after the UINTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check we know it is not negative and thus INTVAL (shft_amnt) and UINTVAL (shft_amnt) then behave the same). But, I had to add INTVAL (mask) > 0 check in that case, otherwise we risk (hypothetically) emitting instruction that doesn't assemble. The problem is with masks that have the MSB bit set, while the instruction can handle those, e.g. ubfiz w1, w0, 13, 19 will do (w0 << 13) & 0xffffe000; in RTL we represent SImode constants with MSB set as negative HOST_WIDE_INT, so it will actually be HOST_WIDE_INT_C (0xffffffffffffe000), and the instruction uses %P3 to print the last operand, which calls asm_fprintf (f, "%u", popcount_hwi (INTVAL (x))) to print that. But that will not print 19, but 51 instead, will include there also all the copies of the sign bit. Not supporting those masks with MSB set isn't a big loss though, they really shouldn't appear normally, as both GIMPLE and RTL optimizations should optimize those away (one isn't masking any bits off with such masks, so just w0 << 13 will do too).
2021-01-26 Jakub Jelinek PR target/98681 * config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p): Use UINTVAL (shft_amnt) and UINTVAL (mask) instead of INTVAL (shft_amnt) and INTVAL (mask). Add && INTVAL (mask) > 0 condition. * gcc.c-torture/execute/pr98681.c: New test. (cherry picked from commit fb09d7242a25971b275292332337a56b86637f2c)
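The 19 versus 51 discrepancy follows directly from sign extension. The sketch below reproduces the arithmetic outside of GCC; simode_as_hwi, popcount32 and popcount64 are hypothetical helpers, not GCC functions, and the model assumes a 64-bit host-wide integer and two's complement conversion.

```cpp
#include <bitset>
#include <cstdint>

// A 32-bit (SImode-like) constant stored sign-extended in a 64-bit host
// integer, as the commit message describes for constants with the MSB set.
inline std::int64_t simode_as_hwi(std::uint32_t bits) {
    return static_cast<std::int64_t>(static_cast<std::int32_t>(bits));
}

// Number of set bits in the 32-bit view of the mask.
inline unsigned popcount32(std::uint32_t v) {
    return static_cast<unsigned>(std::bitset<32>(v).count());
}

// Number of set bits in the full 64-bit host representation; this counts
// the copies of the sign bit as well.
inline unsigned popcount64(std::int64_t v) {
    return static_cast<unsigned>(
        std::bitset<64>(static_cast<std::uint64_t>(v)).count());
}
```

For the mask 0xffffe000 the 32-bit count is 19 (bits 13 to 31), while the count over the sign-extended value is 51 (bits 13 to 63), so printing the latter as the field width gives an operand the assembler has to reject.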
[Bug middle-end/90248] [8/9 Regression] larger than 0 compare fails with -ffinite-math-only -funsafe-math-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90248 --- Comment #20 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:45a6eae129c6fee387a3bb7075181d8509fa6e2a commit r9-9407-g45a6eae129c6fee387a3bb7075181d8509fa6e2a Author: Jakub Jelinek Date: Fri Jan 22 11:50:18 2021 +0100 match.pd: Replace incorrect simplifications into copysign [PR90248] In the PR Andrew said he has implemented a simplification that has been added to LLVM, but that actually is not true, what is in there are X * (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into +-abs(X) but what has been added into GCC are (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into copysign(1, +-X) and then X * copysign (1, +-X) into +-abs (X). The problem is with the (X cmp 0.0 ? +-1.0 : -+1.0) simplifications, they don't work correctly when X is zero. E.g. (X > 0.0 ? 1.0 : -1.0) is -1.0 when X is either -0.0 or 0.0, but copysign will make it return 1.0 for 0.0 and -1.0 only for -0.0. (X >= 0.0 ? 1.0 : -1.0) is 1.0 when X is either -0.0 or 0.0, but copysign will make it return still 1.0 for 0.0 and -1.0 for -0.0. The simplifications were guarded on !HONOR_SIGNED_ZEROS, but as discussed in the PR, that option doesn't mean that -0.0 will not ever appear as operand of some operation, it is hard to guarantee that without compiler adding canonicalizations of -0.0 to 0.0 after most of the operations and thus making it very slow, but that the user asserts that he doesn't care if the result of operations will be 0.0 or -0.0. Not to mention that some of the transformations are incorrect even for positive 0.0. So, instead of those simplifications this patch recognizes patterns where those ?: expressions are multiplied by X, directly into +-abs. That works fine even for 0.0 and -0.0 (as long as we don't care about whether the result is exactly 0.0 or -0.0 in those cases), because whether the result of copysign is -1.0 or 1.0 doesn't matter when it is multiplied by 0.0 or -0.0. 
As a follow-up, maybe we should add the simplification mentioned in the PR, in particular doing copysign by hand through VIEW_CONVERT_EXPR < 0 ? -float_constant : float_constant into copysign (float_constant, float_X). But I think that would need to be done in phiopt. 2021-01-22 Jakub Jelinek PR tree-optimization/90248 * match.pd (X cmp 0.0 ? 1.0 : -1.0 -> copysign(1, +-X), X cmp 0.0 ? -1.0 : +1.0 -> copysign(1, -+X)): Remove simplifications. (X * (X cmp 0.0 ? 1.0 : -1.0) -> +-abs(X), X * (X cmp 0.0 ? -1.0 : 1.0) -> +-abs(X)): New simplifications. * gcc.dg/tree-ssa/copy-sign-1.c: Don't expect any copysign builtins. * gcc.dg/pr90248.c: New test. (cherry picked from commit dd92986ea6d2d363146e1726817a84910453fdc8)
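The zero cases called out in this commit message can be checked with plain IEEE arithmetic. The following is a small sketch, with sign_gt, sign_ge and times_sign_gt as illustrative names that do not come from the patch; it assumes the code is compiled without -ffast-math style options.

```cpp
#include <cmath>

// The ?: forms that used to be rewritten into copysign (1, +-X).
inline double sign_gt(double x) { return x > 0.0 ? 1.0 : -1.0; }
inline double sign_ge(double x) { return x >= 0.0 ? 1.0 : -1.0; }

// The form the patch recognizes instead: the ?: expression multiplied by X.
// Up to the sign of a zero result this equals fabs (X).
inline double times_sign_gt(double x) { return x * sign_gt(x); }
```

sign_gt(0.0) is -1.0 while std::copysign(1.0, 0.0) is 1.0, and sign_ge(-0.0) is 1.0 while std::copysign(1.0, -0.0) is -1.0, so the removed rewrites changed results at zero. times_sign_gt(x) compares equal to std::fabs(x) for every finite x, including both zeros, because the factor's sign is irrelevant once it multiplies a zero.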
[Bug testsuite/97301] [11 regression] gcc.target/powerpc/sse-movlps-1.c fails after r11-3434
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97301 --- Comment #6 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:669d843e4f8e40bcf2cddce92cf8acc3d4851bc7 commit r9-9409-g669d843e4f8e40bcf2cddce92cf8acc3d4851bc7 Author: Jakub Jelinek Date: Sat Jan 23 09:41:58 2021 +0100 rs6000: Fix up __m64 typedef in mmintrin.h [PR97301] The x86 __m64 type is defined as: /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar components. */ typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); and so matches the comment above it in that reads and stores through pointers to __m64 can alias anything. But in the rs6000 headers that is the case only for __m128, but not __m64. The following patch adds that attribute, which fixes the FAIL: gcc.target/powerpc/sse-movhps-1.c execution test FAIL: gcc.target/powerpc/sse-movlps-1.c execution test regressions that appeared when Honza improved ipa-modref. 2021-01-23 Jakub Jelinek PR testsuite/97301 * config/rs6000/mmintrin.h (__m64): Add __may_alias__ attribute. (cherry picked from commit db9a3ce7b83ce3ed3e0ffe7eb7a918595640e161)
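For readers unfamiliar with the attribute, here is a rough illustration of what __may_alias__ on an 8-byte vector typedef permits. It relies on GCC/Clang extensions (vector_size, may_alias, vector subscripting) and assumes a 32-bit int and IEEE-754 single precision; m64_alias and lane_bits_of are names invented for this sketch, not the header's own.

```cpp
// Same shape as the x86 definition quoted above, under a non-reserved name.
typedef int m64_alias __attribute__((__vector_size__(8), __may_alias__));

// Store two floats, then read the same 8 bytes back through the vector type.
// Because of __may_alias__, the compiler may not assume that the vector load
// is unrelated to the preceding float stores.
inline int lane_bits_of(float a, float b, int lane) {
    alignas(8) float buf[2] = {a, b};
    const m64_alias *v = reinterpret_cast<const m64_alias *>(buf);
    return (*v)[lane];  // raw bit pattern of the selected float
}
```

Without the attribute, type-based alias analysis is allowed to treat the float stores and the vector load as independent, which is the kind of assumption that made the powerpc tests fail once ipa-modref became more precise.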
[Bug c++/98672] constexpr function - for loop with return statement doesn't get recognized as constexpr subexpression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98672 --- Comment #5 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:2fe1131465af7352be9e03773c30a3f6059af993 commit r9-9406-g2fe1131465af7352be9e03773c30a3f6059af993 Author: Jakub Jelinek Date: Thu Jan 21 17:20:24 2021 +0100 c++: Fix up potential_constant_expression_1 FOR/WHILE_STMT handling [PR98672] The following testcase is rejected even when it is valid. The problem is that potential_constant_expression_1 doesn't have the accurate *jump_target tracking cxx_eval_* has, and when the loop has a condition that isn't guaranteed to be always true, the body isn't walked at all. That is mostly a correct conservative behavior, except that it doesn't detect if there are any return statements in the body, which means the loop might return instead of falling through to the next statement. We already have code for return stmt discovery in code snippets we don't try to evaluate for switches, so this patch reuses that for FOR_STMT and WHILE_STMT bodies. Note, I haven't touched FOR_EXPR, with statement expressions it could have return stmts in it too, or it could have break or continue statements that wouldn't bind to the current loop but to something outer. That case is clearly mishandled by potential_constant_expression_1 even when the condition is missing or is always true, and it wouldn't surprise me if cxx_eval_* didn't handle it right either, so I'm deferring that to separate PR for later. We'd need proper test coverage for all of that. > Hmm, IF_STMT probably also needs to check the else clause, if the condition > isn't a known constant. You're right, I thought it was ok because it recurses with tf_none, but if the then branch is potentially constant and only else returns, continues or breaks, then as the enhanced testcase shows we were mishandling it too. 2021-01-21 Jakub Jelinek PR c++/98672 * constexpr.c (check_for_return_continue_data): Add break_stmt member. 
(check_for_return_continue): Also look for BREAK_STMT. Handle SWITCH_STMT by ignoring break_stmt from its body. (potential_constant_expression_1) , : If the condition isn't constant true, check if the loop body can contain a return stmt. : Adjust check_for_return_continue_data initializer. : If recursion with tf_none is successful, merge *jump_target from the branches - returns with highest priority, breaks or continues lower. If then branch is potentially constant and doesn't return, check the else branch if it could return, break or continue. * g++.dg/cpp1y/constexpr-98672.C: New test. (cherry picked from commit 8182cbe3fb2c2d20e8dff9d2476fb94046e560b3)
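A function in the spirit of the described testcase looks like the sketch below (C++14 or later). It is not the actual g++.dg/cpp1y/constexpr-98672.C test; report_not_found, not_found_calls and first_multiple are hypothetical names. The loop condition is not a constant, the body can return, and the statement after the loop calls a non-constexpr function.

```cpp
int not_found_calls = 0;

// Deliberately not constexpr: reaching this call during constant evaluation
// would make the evaluation non-constant.
void report_not_found() { ++not_found_calls; }

constexpr int first_multiple(int lo, int hi, int k) {
    for (int i = lo; i < hi; ++i)
        if (i % k == 0)
            return i;          // the loop may return instead of falling through
    report_not_found();        // only reached when no multiple was found
    return -1;
}

// Valid: for these arguments the loop returns before the call is reached.
static_assert(first_multiple(1, 10, 4) == 4, "loop returns during constant evaluation");
```

A compiler that does not notice the return inside the loop body treats the call after the loop as always reached and rejects the function, which is the behavior the patch corrects.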
[Bug c++/98556] [8/9 Regression] ICE: 'verify_gimple' failed since r8-4821-g1af4ebf5985ef2aa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98556 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:ab4f73ae118a57bc50d72c727609d11540447d28 commit r9-9405-gab4f73ae118a57bc50d72c727609d11540447d28 Author: Jakub Jelinek Date: Sat Jan 9 10:49:38 2021 +0100 tree-cfg: Allow enum types as result of POINTER_DIFF_EXPR [PR98556] As conversions between signed integers and signed enums with the same precision are useless in GIMPLE, it seems strange that we require that POINTER_DIFF_EXPR result must be INTEGER_TYPE. If we really wanted to require that, we'd need to change the gimplifier to ensure that, which it isn't the case on the following testcase. What is going on during the gimplification is that when we have the (enum T) (p - q) cast, it is stripped through /* Strip away as many useless type conversions as possible at the toplevel. */ STRIP_USELESS_TYPE_CONVERSION (*expr_p); and when the MODIFY_EXPR is gimplified, the *to_p has enum T type, while *from_p has intptr_t type and as there is no conversion in between, we just create GIMPLE_ASSIGN from that. 2021-01-09 Jakub Jelinek PR c++/98556 * tree-cfg.c (verify_gimple_assign_binary): Allow lhs of POINTER_DIFF_EXPR to be any integral type. * c-c++-common/pr98556.c: New test. (cherry picked from commit 0188eab844eacda5edc6257771edb771844ae069)
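The source-level shape that leads to a POINTER_DIFF_EXPR with an enum-typed result is a pointer subtraction cast straight to an enum of the same precision. A minimal sketch, assuming an enum with a ptrdiff_t underlying type (Distance and distance_between are illustrative names, not taken from pr98556.c):

```cpp
#include <cstddef>

// Enum with the same precision and signedness as the pointer difference.
enum Distance : std::ptrdiff_t { kZero = 0 };

// The cast is a useless conversion in GIMPLE terms, so after stripping it
// the subtraction's result is assigned directly to an enum-typed lhs.
inline Distance distance_between(const char *p, const char *q) {
    return static_cast<Distance>(p - q);
}
```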
[Bug tree-optimization/98474] [8/9 Regression] incorrect results using __uint128_t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98474 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:a750dcc2f404d1785597848c166a1932739def7f commit r9-9404-ga750dcc2f404d1785597848c166a1932739def7f Author: Jakub Jelinek Date: Thu Dec 31 11:06:56 2020 +0100 wide-int: Fix wi::to_mpz [PR98474] The following testcase is miscompiled, because niter analysis miscomputes the number of iterations to 0. The problem is that niter analysis uses mpz_t (wonder why, wouldn't widest_int do the same job?) and when wi::to_mpz is called e.g. on the TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result with wrong value. wi::to_mpz has code to handle negative wide_ints in signed types by inverting all bits, importing to mpz and complementing it, which is fine, but doesn't handle correctly the case when the wide_int's len (times HOST_BITS_PER_WIDE_INT) is smaller than precision when wi::neg_p. E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented in wide_int as 0xffffffffffffffff len 1, and wi::to_mpz would create 0xffffffffffffffff mpz_t value from that. This patch handles it by adding the needed -1 host wide int words (and has also code to deal with precisions that aren't a multiple of HOST_BITS_PER_WIDE_INT). 2020-12-31 Jakub Jelinek PR tree-optimization/98474 * wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type is unsigned and excess negative, append set bits after len until precision. * gcc.c-torture/execute/pr98474.c: New test. (cherry picked from commit a4d191d08c6acb24034af4182b3524e6ef97546c)
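The compressed representation at the heart of this bug can be illustrated with a simplified model. This is not the code in wide-int.cc: it assumes 64-bit words, handles only the unsigned interpretation, and expand_to_precision is a made-up helper. It shows why a value stored as a single -1 word with a precision of 128 has to be widened with all-ones words before being read as an unsigned number.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// `words` holds the explicitly stored low words; the words above them are
// implied copies of the top stored word's sign bit. Returns the value's full
// unsigned bit pattern, truncated to `precision` bits.
inline std::vector<std::uint64_t>
expand_to_precision(const std::vector<std::int64_t> &words, unsigned precision) {
    const unsigned total = (precision + 63) / 64;
    const std::uint64_t fill =
        (!words.empty() && words.back() < 0) ? ~std::uint64_t{0} : 0;
    std::vector<std::uint64_t> out(total, fill);  // implied sign words
    for (std::size_t i = 0; i < words.size() && i < total; ++i)
        out[i] = static_cast<std::uint64_t>(words[i]);
    const unsigned excess = precision % 64;       // bits used in the top word
    if (excess != 0 && total > 0)
        out[total - 1] &= (std::uint64_t{1} << excess) - 1;
    return out;
}
```

Under this model, the stored form {-1} at precision 128 expands to two all-ones words, whereas stopping after the stored word would give only the low 64 bits, the too-small value the commit message describes.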
[Bug c++/98353] [8/9 Regression] ICE in propagate_necessity, at tree-ssa-dce.c:1053 since r6-4886-gcda0a029f45d20f4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98353 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:202240b05f28681053c64efbf1e6deb07f36e1b8 commit r9-9403-g202240b05f28681053c64efbf1e6deb07f36e1b8 Author: Jakub Jelinek Date: Tue Dec 22 00:01:34 2020 +0100 gimplify: Gimplify value in gimplify_init_ctor_eval_range [PR98353] gimplify_init_ctor_eval_range wasn't gimplifying value, so if it wasn't a gimple val, verification at the end of gimplification would ICE (or with release checking some random pass later on would ICE or misbehave). 2020-12-21 Jakub Jelinek PR c++/98353 * gimplify.c (gimplify_init_ctor_eval_range): Gimplify value before storing it into cref. * g++.dg/opt/pr98353.C: New test. (cherry picked from commit f3113a85f098df8165624321cc85d20219fb2ada)
[Bug middle-end/98183] [8/9 Regression] ICE in expand_gimple_stmt_1, at cfgexpand.c:3972
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98183 --- Comment #11 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:eca61f3b10c177258f09c28613062d2adb588984 commit r9-9401-geca61f3b10c177258f09c28613062d2adb588984 Author: Jakub Jelinek Date: Sat Dec 12 08:36:02 2020 +0100 openmp, openacc: Fix up handling of data regions [PR98183] While the data regions (target data and OpenACC counterparts) aren't standalone directives, unlike most other OpenMP/OpenACC constructs we allow (apparently as an extension) exceptions and goto out of the block. During gimplification we place an *end* call into a finally block so that it is reached even on exceptions or goto out etc. During omplower pass we then add paired #pragma omp return for them, but due to the exceptions, because the region is not SESE, we can end up with #pragma omp return appearing only conditionally in the CFG etc., which the ompexp pass can't handle. For the ompexp pass, we actually don't care about the end part or about target data nesting, so we can treat it as standalone directive. 2020-12-12 Jakub Jelinek PR middle-end/98183 * omp-low.c (lower_omp_target): Don't add OMP_RETURN for data regions. * omp-expand.c (expand_omp_target): Don't try to remove OMP_RETURN for data regions. (build_omp_regions_1, omp_make_gimple_edges): Don't expect OMP_RETURN for data regions. * gcc.dg/gomp/pr98183.c: New test. * gcc.dg/goacc/pr98183.c: New test. (cherry picked from commit 8c1ed7223ad1bc19ed9c936ba496220c8ef673bc)
[Bug middle-end/98205] ICE in expand_omp_for_generic, at omp-expand.c:4307
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98205 --- Comment #4 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6f0a0d1c2bb7ad2020852ccf14ca86967ddb134a commit r9-9400-g6f0a0d1c2bb7ad2020852ccf14ca86967ddb134a Author: Jakub Jelinek Date: Thu Dec 10 11:07:07 2020 +0100 openmp: Fix ICE with broken doacross loop [PR98205] If the loop body doesn't ever continue, we don't have a bb to insert the updates. Fixed by not adding them at all in that case. 2020-12-10 Jakub Jelinek PR middle-end/98205 * omp-expand.c (expand_omp_for_generic): Fix up broken_loop handling. * c-c++-common/gomp/doacross-4.c: New test. (cherry picked from commit c925d4cebf817905c237aa2d93887f254b4a74f4)
[Bug c++/98187] ICE in build_call_expr_loc_array, at tree.c:11554
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98187 --- Comment #6 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:22b900e2db91095414832a83ae5761e689c676e7 commit r9-9399-g22b900e2db91095414832a83ae5761e689c676e7 Author: Jakub Jelinek Date: Tue Dec 8 10:45:30 2020 +0100 openmp: -fopenmp-simd fixes [PR98187] This patch fixes two bugs in the -fopenmp-simd support. One is that in C++ #pragma omp parallel master would actually create OMP_PARALLEL in the IL, which is a big no-no for -fopenmp-simd, we should be creating only the constructs -fopenmp-simd handles (mainly OMP_SIMD, OMP_LOOP which is gimplified as simd in that case, declare simd/reduction and ordered simd). The other bug was that #pragma omp master taskloop simd combined construct contains simd and thus should be recognized as #pragma omp simd (with only the simd applicable clauses), but as master wasn't included in omp_pragmas_simd, we'd ignore it completely instead. 2020-12-08 Jakub Jelinek PR c++/98187 * c-pragma.c (omp_pragmas): Remove "master". (omp_pragmas_simd): Add "master". * parser.c (cp_parser_omp_parallel): For parallel master with -fopenmp-simd only, just call cp_parser_omp_master instead of wrapping it in OMP_PARALLEL. * c-c++-common/gomp/pr98187.c: New test. (cherry picked from commit e315ba968d2a47643a9487ea48d62e6399a07d49)
[Bug target/98100] ICE in expand_debug_locations, at cfgexpand.c:5610
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98100 --- Comment #9 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:c47d4bddf3b08f833b2a59cc0be40234fe10e6bc commit r9-9398-gc47d4bddf3b08f833b2a59cc0be40234fe10e6bc Author: Jakub Jelinek Date: Fri Dec 4 12:18:21 2020 +0100 debug: Fix another vector DECL_MODE ICE [PR98100] The PR88587 fix changes DECL_MODE of vars with vector type during inlining/cloning when the vars are copied, so that their DECL_MODE matches their TYPE_MODE in the new function. Unfortunately, the following testcase still ICEs, the var isn't really used in the new function and so it isn't copied, but becomes just a nonlocalized var. So we can't adjust its DECL_MODE because it appears in multiple functions and needs different modes in between them. The following patch changes the DEBUG_INSN creation to use TYPE_MODE instead of DECL_MODE for vars with vector types. 2020-12-04 Jakub Jelinek PR target/98100 * cfgexpand.c (expand_gimple_basic_block): For vars with vector type, use TYPE_MODE rather than DECL_MODE. * gcc.target/i386/pr98100.c: New test. (cherry picked from commit 4c18faa4dd4dffdb76bc879b774ce3f4da01)
[Bug c++/98072] [9 Regression] ICE in cp_parser_omp_var_list_no_open, at cp/parser.c:34843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98072 --- Comment #8 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:705afe9b40aabbe395baa2979cdac0a9fef194ef commit r9-9396-g705afe9b40aabbe395baa2979cdac0a9fef194ef Author: Jakub Jelinek Date: Tue Dec 1 21:41:44 2020 +0100 openmp: Avoid ICE on depend clause on depobj OpenMP construct [PR98072] Since r11-5430 we ICE on the following testcase. When parsing the depobj directive we don't really use cp_parser_omp_all_clauses routine which ATM disables generation of location wrappers and the newly added assertion that there are no location wrappers thus triggers. Fixed by adding the location wrappers suppression sentinel. Longer term, we should handle location wrappers inside of OpenMP clauses. 2020-12-01 Jakub Jelinek PR c++/98072 * parser.c (cp_parser_omp_depobj): Suppress location wrappers when parsing depend clause. * c-c++-common/gomp/depobj-2.c: New test. (cherry picked from commit d62daad11b21a2ee9c39a43c5e94e7b966793dbd)
[Bug target/98063] Emit R_X86_64_GOTOFF64 instead of R_X86_64_GOTPCRELX for -mcmodel=large -fno-plt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98063 --- Comment #4 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:4b35e830e7161cd6a453a26c7c4407b477311e65 commit r9-9395-g4b35e830e7161cd6a453a26c7c4407b477311e65 Author: Jakub Jelinek Date: Tue Dec 1 10:44:40 2020 +0100 x86_64: Fix up -fpic -mcmodel=large -fno-plt [PR98063] On the following testcase with -fpic -mcmodel=large -fno-plt we emit call puts@GOTPCREL(%rip) but that is not really appropriate for CM_LARGE_PIC, the .text can be larger than 2GB in that case and the .got slot further away from %rip than what can fit into the signed 32-bit immediate. The following patch computes the address of the .got slot the way it is computed for that model for function pointer loads, and calls that. 2020-12-01 Jakub Jelinek PR target/98063 * config/i386/i386.c (ix86_expand_call): Handle non-plt CM_LARGE_PIC calls. * gcc.target/i386/pr98063.c: New test. (cherry picked from commit ebc8606a9408623e2fa2a02a5526b882ffd0e7a8)
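The range limit behind this fix is the signed 32-bit displacement of a %rip-relative operand. A small sketch of the check follows; fits_rip_relative is a hypothetical helper, and next_insn_addr stands for the address the displacement is measured from, which is an assumption of this model rather than anything taken from the patch.

```cpp
#include <cstdint>
#include <limits>

// True when `target` can be reached from `next_insn_addr` with a signed
// 32-bit displacement, as foo@GOTPCREL(%rip) requires.
inline bool fits_rip_relative(std::uint64_t next_insn_addr, std::uint64_t target) {
    const std::int64_t delta = static_cast<std::int64_t>(target - next_insn_addr);
    return delta >= std::numeric_limits<std::int32_t>::min() &&
           delta <= std::numeric_limits<std::int32_t>::max();
}
```

In the large PIC model nothing guarantees this holds for a .got slot, since .text alone may exceed 2GB, so the slot's address has to be computed by other means before the indirect call.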