[Bug sanitizer/115619] [ASAN] new-delete-type-mismatch on aligned operator new

2024-06-24 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115619 --- Comment #1 from Thiago Macieira --- Matching Clang bug report: https://github.com/llvm/llvm-project/issues/96512

[Bug sanitizer/115619] New: [ASAN] new-delete-type-mismatch on aligned operator new

2024-06-24 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115619 Bug ID: 115619 Summary: [ASAN] new-delete-type-mismatch on aligned operator new Product: gcc Version: 14.1.0 Status: UNCONFIRMED Severity: normal

[Bug target/114576] [14 regression] VEX-prefixed AES instruction without AVX enabled

2024-04-03 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114576 --- Comment #4 from Thiago Macieira --- (In reply to Jakub Jelinek from comment #3) > vaesenc etc. instructions can be used even if just -maes -mavx, not just > -mvaes -mavx512vl. Correct, that's just VEX-prefixed AESNI instructions. VAES

[Bug target/114576] New: [13 regression][config/i386] GCC 14/trunk emits VEX-prefixed AES instruction without AVX enabled

2024-04-03 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114576 Bug ID: 114576 Summary: [13 regression][config/i386] GCC 14/trunk emits VEX-prefixed AES instruction without AVX enabled Product: gcc Version: 14.0 Status: UNCONFIRMED

[Bug c/114088] Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslenw

2024-02-24 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088 --- Comment #3 from Thiago Macieira --- > But __builtin_strlen *does* get optimized when the input is a string literal. > Not sure about wcslen though. It appears not to, in the test above. std::char_trait::length() calls wcslen() whereas

[Bug c/114088] New: Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslenw

2024-02-24 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088 Bug ID: 114088 Summary: Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslenw Product: gcc Version: unknown Status: UNCONFIRMED

[Bug target/113465] [mingw-w64] dllexported constexpr (inline) variables not automatically emitted

2024-02-03 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113465 --- Comment #6 from Thiago Macieira --- Mind if I ask you reconsider the decision for inline variables (which all constexpr ones are)?

[Bug c++/54483] undefined reference to static constexpr in .so

2024-01-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54483 --- Comment #13 from Thiago Macieira --- (In reply to Andrew Pinski from comment #11) > You still need: > constexpr float A::val; In C++11 mode, yes. C++17 made all static constexpr data members implicitly inline, which changes the situation.
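
The C++11 versus C++17 difference described in this comment can be sketched as follows. This is only an illustration: `A` and `val` follow the names in the quoted reply, while the value 1.5f and the helper `address_of_val` are assumptions made up for the example.

```cpp
// Sketch only, assuming a C++17 (or later) compiler mode.
struct A {
    static constexpr float val = 1.5f;  // implicitly inline since C++17
};

// In C++11/14 mode, exactly one translation unit would also need the
// out-of-class definition quoted in the comment, because taking the
// address below odr-uses A::val:
//   constexpr float A::val;

// Hypothetical helper: odr-uses A::val by taking its address.
const float *address_of_val() {
    return &A::val;
}
```

Built as C++17, this links without the extra definition; built as C++11 or C++14, the commented-out line is what the quoted reply says is still required.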

[Bug target/113465] [mingw-w64] dllexported constexpr (inline) variables not automatically emitted

2024-01-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113465 --- Comment #5 from Thiago Macieira --- > I don't think that's the same. That situation over there is C++11, where the > constexpr variable is *not* static. I meant not *inline*.

[Bug target/113465] [mingw-w64] dllexported constexpr (inline) variables not automatically emitted

2024-01-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113465 --- Comment #4 from Thiago Macieira --- (In reply to Andrew Pinski from comment #3) > See PR 54483 . > > *** This bug has been marked as a duplicate of bug 54483 *** I don't think that's the same. That situation over there is C++11, where the

[Bug target/113465] New: [mingw-w64] dllexported constexpr (inline) variables not automatically emitted

2024-01-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113465 Bug ID: 113465 Summary: [mingw-w64] dllexported constexpr (inline) variables not automatically emitted Product: gcc Version: 13.2.1 Status: UNCONFIRMED

[Bug libstdc++/111244] std::filesystem::path encoding mismatches locale on Windows

2023-08-30 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111244 --- Comment #7 from Thiago Macieira --- (In reply to Costas Argyris from comment #6) > At this point I just meant embedding it in your example a.out executable > file, just to check if it will work correctly. Ah, got it. But that is not the

[Bug libstdc++/111244] std::filesystem::path encoding mismatches locale on Windows

2023-08-30 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111244 --- Comment #5 from Thiago Macieira --- (In reply to Jonathan Wakely from comment #3) > Somebody else will have to fix this, I've already wasted too much of my life > making std::filesystem (mostly) work on Windows. Same here. (In reply to

[Bug libstdc++/111244] std::filesystem::path encoding mismatches locale on Windows

2023-08-30 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111244 --- Comment #2 from Thiago Macieira --- (In reply to Andrew Pinski from comment #1) > Except the code page could be tuned via a manifest file even. > For an example GCC embeds a manifest into its own compiler to work around > this issue and

[Bug c++/111244] New: std::filesystem::path encoding mismatches locale on Windows

2023-08-30 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111244 Bug ID: 111244 Summary: std::filesystem::path encoding mismatches locale on Windows Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal

[Bug c++/111105] New: [12/13/14 regression] __attribute__((malloc)) can no longer name a C++ member function

2023-08-22 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111105 Bug ID: 111105 Summary: [12/13/14 regression] __attribute__((malloc)) can no longer name a C++ member function Product: gcc Version: 14.0 Status: UNCONFIRMED

[Bug target/110591] New: [i386] (Maybe) Missed optimisation: _cmpccxadd sets flags

2023-07-07 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110591 Bug ID: 110591 Summary: [i386] (Maybe) Missed optimisation: _cmpccxadd sets flags Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity: normal

[Bug target/110184] New: [i386] Missed optimisation: atomic operations should use PF, ZF and SF

2023-06-08 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110184 Bug ID: 110184 Summary: [i386] Missed optimisation: atomic operations should use PF, ZF and SF Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity:

[Bug target/109896] Missed optimisation: overflow detection in multiplication instructions for operator new

2023-05-18 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109896 --- Comment #7 from Thiago Macieira --- (In reply to Jonathan Wakely from comment #6) > With placement-new there's no allocation: > https://gcc.godbolt.org/z/68e4PaeYz Is the exception expected there, though?

[Bug target/109896] Missed optimisation: overflow detection in multiplication instructions for operator new

2023-05-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109896 --- Comment #5 from Thiago Macieira --- (In reply to Andrew Pinski from comment #4) > If you are that picky for cycles, these cycles are not going to be a problem > compared to the dynamic allocation that is just about to happen .. Yeah, I

[Bug target/109896] Missed optimisation: overflow detection in multiplication instructions for operator new

2023-05-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109896 --- Comment #3 from Thiago Macieira --- (In reply to H.J. Lu from comment #2) > (In reply to Andrew Pinski from comment #1) > > I suspect the overflow code was added before __builtin_*_overflow were added > > which is why the generated code is
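
For readers unfamiliar with the __builtin_*_overflow family mentioned in the quoted comment, here is a minimal sketch of an overflow-checked size computation of the kind an array new-expression needs. The helper name `array_bytes_or_max` is hypothetical, and saturating to SIZE_MAX on overflow is an illustrative choice, not a description of what GCC itself emits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns count * elem_size, or SIZE_MAX if the
// exact product does not fit in std::size_t.
std::size_t array_bytes_or_max(std::size_t count, std::size_t elem_size) {
    std::size_t bytes = 0;
    // __builtin_mul_overflow is a GCC/Clang builtin that stores the
    // (possibly wrapped) product in `bytes` and returns true on overflow.
    if (__builtin_mul_overflow(count, elem_size, &bytes))
        return SIZE_MAX;
    return bytes;
}
```

On x86 the builtin typically lets the compiler test the overflow flag of the multiplication directly instead of using a separate comparison.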

[Bug tree-optimization/106409] GCC with LTO: Warning: argument 1 value ‘18...615’ (SIZE_MAX) exceeds maximum object size with new

2023-05-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106409 --- Comment #8 from Thiago Macieira --- (In reply to Andrew Pinski from comment #7) > See PR 58525 also which added that code path. That explains why it won't call __cxa_throw_bad_array_new_length, but not why it will call operator new[](-1).

[Bug tree-optimization/106409] GCC with LTO: Warning: argument 1 value ‘18...615’ (SIZE_MAX) exceeds maximum object size with new

2023-05-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106409 --- Comment #6 from Thiago Macieira --- Suggestion: add a function to libgcc to be called instead of __cxa_throw_bad_array_new_length when exceptions are disabled. That function can be a mere two instructions, but it provides two advantages: *

[Bug target/109896] New: Missed optimisation: overflow detection in multiplication instructions for operator new

2023-05-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109896 Bug ID: 109896 Summary: Missed optimisation: overflow detection in multiplication instructions for operator new Product: gcc Version: 13.1.1 Status: UNCONFIRMED

[Bug c++/109895] New: -Walloc-size-larger-than complains about code it generated itself under -flto -fno-exceptions

2023-05-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109895 Bug ID: 109895 Summary: -Walloc-size-larger-than complains about code it generated itself under -flto -fno-exceptions Product: gcc Version: 13.1.1 Status: UNCONFIRMED

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277 --- Comment #21 from Thiago Macieira --- I understand that. I don't think it's a reason to repeat the policy, though. Anyway, I don't have any new arguments than when we discussed this two years ago, so I won't pursue this matter further.

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277 --- Comment #19 from Thiago Macieira --- (In reply to Jonathan Wakely from comment #18) > We have not committed to a stable ABI for C++20 yet. That was my argument when creating this bug report two years ago: if it's available in the standard

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277 --- Comment #17 from Thiago Macieira --- (In reply to Thomas Rodgers from comment #16) > The original implementation came from Olivier Giroux and is part of libc++. > The libc++ implementation also does not use a type that futex or >

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277 --- Comment #15 from Thiago Macieira --- > > 5) std::barrier implementation also uses a type that futex(2) can't handle > barrier still uses a 1-byte enum for the atomic waits. That can only now be fixed for libstdc++.so.7, then.

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277 Thiago Macieira changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug tree-optimization/108980] [13 Regression] Warning text missing the warning itself (GCC 13)

2023-03-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108980 --- Comment #9 from Thiago Macieira --- Ah, got it. That also explains why I couldn't find anything wrong with my code, and nothing I did that could likely be it made the warning go away. Thanks for the quick turnaround.

[Bug tree-optimization/108980] [13 Regression] Warning text missing the warning itself (GCC 13)

2023-03-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108980 --- Comment #7 from Thiago Macieira --- The duplicate "note:" disappeared. But now there's no warning at all on the same file, with the same options. Was that intended?

[Bug tree-optimization/108980] [13 Regression] Warning text missing the warning itself (GCC 13)

2023-03-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108980 --- Comment #6 from Thiago Macieira --- Testing.

[Bug c++/108980] Warning text missing the warning itself (GCC 13)

2023-02-28 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108980 --- Comment #1 from Thiago Macieira --- GCC 13 (trunk) built today.

[Bug c++/108980] New: Warning text missing the warning itself (GCC 13)

2023-02-28 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108980 Bug ID: 108980 Summary: Warning text missing the warning itself (GCC 13) Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3

[Bug preprocessor/108372] New: [12 regression] -E -fdirectives-only crash

2023-01-11 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108372 Bug ID: 108372 Summary: [12 regression] -E -fdirectives-only crash Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug target/98112] Add -f[no-]direct-access-external-data & drop HAVE_LD_PIE_COPYRELOC

2023-01-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 --- Comment #9 from Thiago Macieira --- I can't be certain for other architectures' performance, but my feeling is that indeed they would benefit from this. The option that was added as an -m should be an -f (and match Clang's option). However,

[Bug c++/108216] Wrong offset for (already-constructed) virtual base during construction of full object

2022-12-23 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108216 --- Comment #3 from Thiago Macieira --- In bug 70644, the pointer to Base was passed to Base's constructor, so the conversion from the derived type to the virtual base Base happened clearly before said base was constructed. In this example

[Bug tree-optimization/104475] [12/13 Regression] Wstringop-overflow + atomics incorrect warning on dynamic object

2022-12-06 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104475 --- Comment #19 from Thiago Macieira --- (In reply to Richard Biener from comment #15) > Thanks, it's still the same reason - we isolate a nullptr case and end up > with > > __atomic_or_fetch_4 (184B, 64, 0); [tail call] > > The path we

[Bug tree-optimization/104475] [12/13 Regression] Wstringop-overflow + atomics incorrect warning on dynamic object

2022-12-05 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104475 --- Comment #14 from Thiago Macieira --- Created attachment 54015 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54015&action=edit qfutureinterface.cpp preprocessed [gcc trunk-20221205] (In reply to Richard Biener from comment #13) > There's

[Bug target/107456] std::atomic::fetch_xxx generate LOCK CMPXCHG instead of simpler LOCK instructions

2022-11-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107456 --- Comment #4 from Thiago Macieira --- (In reply to Thiago Macieira from comment #3) > With the Remote Atomic Operations (RAO) of AAND, AOR and AXOR, we can do > something. Correcting myself: the RAO instructions don't give us the result back

[Bug target/107456] std::atomic::fetch_xxx generate LOCK CMPXCHG instead of simpler LOCK instructions

2022-10-31 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107456 Thiago Macieira changed: What|Removed |Added CC||thiago at kde dot org --- Comment #3

[Bug c++/106395] New: [10/11 regression] [mingw] "redeclared without dllimport attribute: previous dllimport ignored" on C++ friend

2022-07-21 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106395 Bug ID: 106395 Summary: [10/11 regression] [mingw] "redeclared without dllimport attribute: previous dllimport ignored" on C++ friend Product: gcc Version:

[Bug c++/77306] Unable to specify visibility for explicit template instantiations

2022-06-19 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77306 Thiago Macieira changed: What|Removed |Added CC||thiago at kde dot org --- Comment #3

[Bug c++/106023] Would like to control the ELF visibility of template explicit instantiations

2022-06-19 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106023 Thiago Macieira changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug c++/106023] New: Would like to control the ELF visibility of template explicit instantiations

2022-06-18 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106023 Bug ID: 106023 Summary: Would like to control the ELF visibility of template explicit instantiations Product: gcc Version: 13.0 Status: UNCONFIRMED Severity:

[Bug middle-end/105348] Overly aggressive -Warray-bounds after conditional

2022-05-31 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105348 --- Comment #4 from Thiago Macieira --- One more Qt workaround, for the record: https://codereview.qt-project.org/c/qt/qtbase/+/413730

[Bug c++/105509] New: [compatibility] f16 suffix not supported in C++ mode - unable to find numeric literal operator ‘operator""f16’

2022-05-06 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105509 Bug ID: 105509 Summary: [compatibility] f16 suffix not supported in C++ mode - unable to find numeric literal operator ‘operator""f16’ Product: gcc Version:

[Bug middle-end/105348] Overly aggressive -Warray-bounds after conditional

2022-04-25 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105348 --- Comment #3 from Thiago Macieira --- I understand. I'm just trying to avoid having to add code for a corner-case. People don't usually parse empty buffers, so it's usually fine to allow it to proceed and discover an EOF condition. Anyway,

[Bug middle-end/105348] Overly aggressive -Warray-bounds after conditional

2022-04-22 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105348 --- Comment #1 from Thiago Macieira --- Qt workaround: https://codereview.qt-project.org/c/qt/qtbase/+/407217

[Bug middle-end/105348] New: Overly aggressive -Warray-bounds after conditional

2022-04-22 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105348 Bug ID: 105348 Summary: Overly aggressive -Warray-bounds after conditional Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3

[Bug target/103069] cmpxchg isn't optimized

2022-02-22 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #20 from Thiago Macieira --- I think there will be cases where the relaxation makes sense and others where it doesn't because the surrounding code already does it. So I'd like to control per emission. If I can't do it per code

[Bug target/103069] cmpxchg isn't optimized

2022-02-22 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #18 from Thiago Macieira --- (In reply to Jakub Jelinek from comment #17) > _Pragma("GCC target \"relax-cmpxchg-loop\"") > should do that (ditto target("relax-cmpxchg-loop") attribute). The attribute is applied to a function. I'm

[Bug target/103069] cmpxchg isn't optimized

2022-02-22 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #16 from Thiago Macieira --- Can this option be enabled and disabled with a _Pragma?

[Bug target/103069] cmpxchg isn't optimized

2022-02-21 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #14 from Thiago Macieira --- I'd restrict relaxations to loops emitted by the compiler. All other atomic operations shouldn't be modified at all, unless the user asks for it. That includes non-looping atomic operations (like LOCK

[Bug c++/104492] New: Bogus dangling pointer warning (dangling pointer to ‘candidates’ may be used [-Werror=dangling-pointer=])

2022-02-10 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104492 Bug ID: 104492 Summary: Bogus dangling pointer warning (dangling pointer to ‘candidates’ may be used [-Werror=dangling-pointer=]) Product: gcc Version: 12.0 Status:

[Bug c++/104475] New: Wstringop-overflow + atomics incorrect warning on dynamic object

2022-02-09 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104475 Bug ID: 104475 Summary: Wstringop-overflow + atomics incorrect warning on dynamic object Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal

[Bug c++/104243] Optimization requires __sync_synchronize

2022-01-27 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104243 --- Comment #7 from Thiago Macieira --- (In reply to Martin Liška from comment #6) > Anyway, upstream removed the pure attribute as we suggested: > https://codereview.qt-project.org/c/qt/qtbase/+/392357 Can we be assured the pure attribute

[Bug target/104250] New: [i386] GCC may want to use 32-bit (I)DIV if it can for 64-bit operands

2022-01-26 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104250 Bug ID: 104250 Summary: [i386] GCC may want to use 32-bit (I)DIV if it can for 64-bit operands Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal

[Bug target/103069] cmpxchg isn't optimized

2022-01-24 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #10 from Thiago Macieira --- (In reply to H.J. Lu from comment #9) > nptl/nptl_setxid.c in glibc has > > do > { > flags = THREAD_GETMEM (self, cancelhandling); > newval = THREAD_ATOMIC_CMPXCHG_VAL (self,

[Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned

2021-12-21 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001 --- Comment #7 from Thiago Macieira --- Hack to workaround: asm( ".macro vmovapd args:vararg\n" "vmovupd \\args\n" ".endm\n" ".macro vmovaps args:vararg\n" "vmovups \\args\n" ".endm\n" ".macro vmovdqa

[Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction

2021-12-20 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774 Thiago Macieira changed: What|Removed |Added CC||hjl.tools at gmail dot com ---

[Bug target/103774] New: [i386] GCC should swap the arguments to certain functions to generate a single instruction

2021-12-20 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774 Bug ID: 103774 Summary: [i386] GCC should swap the arguments to certain functions to generate a single instruction Product: gcc Version: 12.0 Status: UNCONFIRMED

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #8 from Thiago Macieira --- Update again: looks like the issue was the next line I didn't paste, which was performing _kortestz_mask32_u8 on an __mmask16. The type mismatch was causing this problem. If I use the correct

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #7 from Thiago Macieira --- I should add the same is not happening for Char == char, meaning the returned type is an __mmask32 (unsigned) vmovdqu8 (%rsi), %ymm2 vmovdqu8 32(%rsi), %ymm3 vpcmpub

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #6 from Thiago Macieira --- It got worse. Now I'm seeing: .L807: vmovdqu16 (%rsi), %ymm2 vmovdqu16 32(%rsi), %ymm3 vpcmpuw $6, %ymm0, %ymm2, %k2 vpcmpuw $6, %ymm0, %ymm3, %k3

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-17 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #5 from Thiago Macieira --- Maybe this is running afoul of GCC's thinking that a simple register-register move is free? I've seen it save a constant in an opmask register, but kmov{d,q} is not free like mov{l,q} is.

[Bug target/103750] New: [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-16 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 Bug ID: 103750 Summary: [i386] GCC schedules KMOV instructions that destroys performance in loop Product: gcc Version: 12.0 Status: UNCONFIRMED Severity:

[Bug target/103066] __sync_val_compare_and_swap/__sync_bool_compare_and_swap aren't optimized

2021-11-06 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103066 --- Comment #10 from Thiago Macieira --- You're right that emitting more penalises those who have done their job and written proper code. The problem we're seeing is that such code appears to be the minority. Or, maybe put differently, the bad

[Bug target/103090] [i386] GCC should use the SF and ZF flags in some atomic_fetch_op sequences

2021-11-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103090 --- Comment #1 from Thiago Macieira --- One more: bool tsign3(std::atomic ) { // any two or more bits, so long as the sign bit is one of them // (or the compiler doesn't know what's in the variable) int bits = 1 | signbit;

[Bug target/103069] cmpxchg isn't optimized

2021-11-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #2 from Thiago Macieira --- See also bug 103090 for a few more (restricted) possibilities to replace a cmpxchg loop with a LOCK RMW operation.

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-11-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #29 from Thiago Macieira --- New suggestion in bug 103090

[Bug middle-end/103090] New: [i386] GCC should use the SF and ZF flags in some atomic_fetch_op sequences

2021-11-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103090 Bug ID: 103090 Summary: [i386] GCC should use the SF and ZF flags in some atomic_fetch_op sequences Product: gcc Version: 12.0 Status: UNCONFIRMED Severity:

[Bug target/103069] cmpxchg isn't optimized

2021-11-03 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069 --- Comment #1 from Thiago Macieira --- (the assembly doesn't match the source code, but we got your point) Another possible improvement for the __atomic_fetch_{and,nand,or} functions is that it can check whether the fetched value is already

[Bug libstdc++/101583] [12 Regression] error: use of deleted function when building gold

2021-10-14 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101583 Thiago Macieira changed: What|Removed |Added CC||thiago at kde dot org --- Comment

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-07 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #26 from Thiago Macieira --- (In reply to H.J. Lu from comment #25) > Can you get some performance improvement data on real workloads? Will ask.

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-07 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #24 from Thiago Macieira --- (In reply to H.J. Lu from comment #23) > I renamed the commit title. The new v3 is the v6 + fixes. Got it. Still no issues.

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-06 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #22 from Thiago Macieira --- (In reply to H.J. Lu from comment #21) > Created attachment 51559 [details] > The new v3 patch > > The new v3 patch to check invalid mask. v3? We were already up to v6.

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-06 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #20 from Thiago Macieira --- And: $ cat /tmp/test.cpp #include bool tbit(std::atomic ) { return i.fetch_xor(CONSTANT, std::memory_order_relaxed) & (CONSTANT); } $ ~/dev/gcc/bin/gcc "-DCONSTANT=(1LL<<63)" -S -o - -O2

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-05 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #19 from Thiago Macieira --- (In reply to H.J. Lu from comment #17) > Created attachment 51558 [details] > The v6 patch > > Please try this. Confirmed for all inputs.

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-05 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #15 from Thiago Macieira --- Works now for the failing case. Additionally: bool tbit(std::atomic ) { return i.fetch_and(~CONSTANT, std::memory_order_relaxed) & (CONSTANT); } Will properly produce LOCK BTR (CONSTANT=2):

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #12 from Thiago Macieira --- Commit 7e0c0500808d58bca5b8e23cbd474022c32234e4 + your patch.

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #11 from Thiago Macieira --- $ for ((i=0;i<32;++i)); do ~/dev/gcc/bin/gcc "-DCONSTANT=(1<<$i)" -S -o - -O2 /tmp/test.cpp | grep bts; done lock btsl $0, (%rdi) lock btsl $1, (%rdi) lock btsl

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #9 from Thiago Macieira --- Looks like it doesn't work for the sign bit. $ cat /tmp/test.cpp #include bool tbit(std::atomic ) { return i.fetch_or(CONSTANT, std::memory_order_relaxed) & CONSTANT; } $ ~/dev/gcc/bin/gcc

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #8 from Thiago Macieira --- $ cat /tmp/test.cpp #include bool tbit(std::atomic ) { return i.fetch_or(1, std::memory_order_relaxed) & 1; } $ ~/dev/gcc/bin/gcc -S -o - -O2 /tmp/test.cpp .file "test.cpp" .text

[Bug middle-end/102566] [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-04 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 --- Comment #7 from Thiago Macieira --- (In reply to H.J. Lu from comment #5) > Created attachment 51536 [details] > A patch > > Please try this. Give me an hour (will try v2).

[Bug target/102566] New: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic

2021-10-02 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566 Bug ID: 102566 Summary: [i386] GCC should emit LOCK BTS for simple bit-test-and-set operations with std::atomic Product: gcc Version: unknown Status: UNCONFIRMED

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #9 from Thiago Macieira --- > clang defines them as intrinsic because they support AMX register allocation > (a lot of effort), gcc does not support AMX register allocation for now, and > defining them as intrinsic + builtin doesn't

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #6 from Thiago Macieira --- > I suggest doing as Clang did and make it an intrinsic. Or even a __builtin_ia32_markamxtile(); intrinsic, which produces the error if misused and does add the necessary bits to the .note.gnu.property

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #5 from Thiago Macieira --- (In reply to Hongtao.liu from comment #4) > Because _tile_loadd is implemented as embedded assembly plus macros, if > __AMX_TILE__ is removed, no error will be reported if the user does not use > the

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #3 from Thiago Macieira --- There appears to be some preprocessor magic behind the scenes because the preprocessed output can't be compiled either: $ gcc -no-integrated-cpp -Werror=implicit-function-declaration -c -xc test.cpp

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #2 from Thiago Macieira --- FYI: $ cat test.cpp #include __attribute__((target("avx"))) void avx() { _mm256_zeroall(); } #ifndef __INTEL_COMPILER __attribute__((target("amx-tile"))) #endif void amx() { _tile_loadd(0, 0,

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 Thiago Macieira changed: What|Removed |Added CC||hjl.tools at gmail dot com ---

[Bug target/102166] New: [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 Bug ID: 102166 Summary: [i386] AMX intrinsics and macros not defined in C++ Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2021-04-27 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277 --- Comment #8 from Thiago Macieira --- This one is probably 12.0.

[Bug target/100005] undefined reference to `_rdrand64_step'

2021-04-12 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100005 --- Comment #14 from Thiago Macieira --- (In reply to Jakub Jelinek from comment #13) > The same like in C. > I.e. > extern inline __attribute__((gnu_inline, always_inline, artificial)) int foo > (int x) { return x; } > // The above is

[Bug target/100005] undefined reference to `_rdrand64_step'

2021-04-12 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100005 --- Comment #12 from Thiago Macieira --- (In reply to Richard Biener from comment #11) > Invalid. Note we can't really diagnose GNU extern inline address-taking > since > by definition that's allowed (just the definition needs to come from >

[Bug c/100005] undefined reference to `_rdrand64_step'

2021-04-09 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100005 --- Comment #6 from Thiago Macieira --- (In reply to Jakub Jelinek from comment #5) > then one would get an out of line copy when taking their address, but it > would > duplicated in all the TUs that did this. That's not a problem, since

[Bug c/100005] undefined reference to `_rdrand64_step'

2021-04-09 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100005 --- Comment #4 from Thiago Macieira --- That's an artificial (pun intended) limitation. In C++: template int fill_array(Generator generator, unsigned long long *rand_array) Also errors out with the same error, but works if you do:

[Bug c/100005] New: undefined reference to `_rdrand64_step'

2021-04-09 Thread thiago at kde dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100005 Bug ID: 100005 Summary: undefined reference to `_rdrand64_step' Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c
