[Bug c++/116967] Accepts-invalid missing constinit specifier on initializing declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116967 --- Comment #1 from Jan Schultke --- I really like the Clang output by the way, which GCC could copy almost directly: > :2:5: warning: 'constinit' specifier missing on initializing > declaration of 'x' [-Wmissing-constinit] > 2 | int x; > | ^ > | constinit > :1:8: note: variable declared constinit here > 1 | extern constinit int x; > |^ > 1 warning generated.
[Bug c++/116967] New: Accepts-invalid missing constinit specifier on initializing declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116967 Bug ID: 116967 Summary: Accepts-invalid missing constinit specifier on initializing declaration Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- extern constinit int x; int x; This TU violates https://eel.is/c++draft/dcl.constinit#1.sentence-2: > If the [constinit] specifier is applied to any declaration of a variable, > it shall be applied to the initializing declaration. However, GCC accepts this program with no warnings: https://godbolt.org/z/9naoPWKnq (unlike Clang and MSVC, which warn and error, respectively)
[Bug c++/116727] New: "this" is unusable in an explicit object member function lambda capturing this
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116727 Bug ID: 116727 Summary: "this" is unusable in an explicit object member function lambda capturing this Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- struct awoo { void chan() { [this](this auto&& self) { this; }; } }; Clang correctly accepts this, GCC rejects it with the error (https://godbolt.org/z/rKbzxn863): > :4:13: error: invalid use of 'this' in non-member function > [-Wtemplate-body] > 4 | this; > | ^~~~ It seems like GCC is falsely rejecting the use of "this" here because the lambda's call operator is an explicit object member function. However, lambda bodies do not introduce a class scope (see http://eel.is/c++draft/expr.prim.this#note-1) and http://eel.is/c++draft/expr.prim.this#3.sentence-2 states: > It shall not appear within the declaration of a static or explicit object > member function of the current class ... Since there is no current class, it is legal to use "this" in principle, and the rules for what "this" does should be those stated in [expr.prim.lambda], as usual.
[Bug libstdc++/101485] Calling std::equal with std::byte* does not use memcmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101485 --- Comment #12 from Jan Schultke --- On a language evolution note, https://wg21.link/P2825 would let you detect whether an equality comparison for enumerations is overloaded by checking whether > declcall(E{} == E{}) ... is well-formed. If this makes it into C++26, the solution could be broadened to all enumerations robustly.
[Bug libstdc++/101485] Calling std::equal with std::byte* does not use memcmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101485 --- Comment #8 from Jan Schultke --- It is a tiny bit pessimistic if it uses std::convertible_to instead of std::__boolean_testable or what it was called. I cannot come up with an example that produces a false positive though (which is crucial for correctness), and if we cover 99% of enums, that's still much better than just std::byte. Nice job.
[Bug libstdc++/101485] Calling std::equal with std::byte* does not use memcmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101485 --- Comment #5 from Jan Schultke --- &operator==(E,E); is not a valid expression, but I understand what you're trying to do there. Perhaps you can test by converting to a function pointer bool(*)(E,E). It would surely miss cases like an operator== with an always-defaulted third parameter, or one where the return type is contextually convertible to bool, but not exactly bool, or: template T> bool operator==(T, T) { return false; } ... or other cases. You would really need some way to detect whether an expression (x == y) uses any overloaded operators, and I don't see how you could do that (without additional intrinsics).
[Bug libstdc++/101485] Calling std::equal with std::byte* does not use memcmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101485 Jan Schultke changed: What|Removed |Added CC||janschultke at googlemail dot com --- Comment #2 from Jan Schultke --- Would it be wrong to extend the definition to all enumeration types instead of just std::byte? I don't see what could be wrong about memcmping enumerations, given that this is correct for integers, and enumerations always have integer underlying types.
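A minimal sketch of what the suggestion amounts to, assuming the enumeration has no user-provided operator== (the case the later comments worry about). The names Color and equal_enum_range are hypothetical and not part of libstdc++:

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical enumeration used only for illustration.
enum class Color : unsigned char { red, green, blue };

// Hypothetical helper: compares two ranges of an enumeration type bytewise.
// Assumption: E has no overloaded operator==, so the built-in comparison
// agrees with comparing the underlying integer representation.
template <typename E>
bool equal_enum_range(const E* a, const E* b, std::size_t n) {
    static_assert(std::is_enum_v<E>, "sketch only covers enumeration types");
    return std::memcmp(a, b, n * sizeof(E)) == 0;
}
```

Whether a library can safely take this path for every enumeration depends on detecting overloaded comparison operators, which is exactly what the rest of the thread discusses.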
[Bug c++/115417] New: Destructor is noexcept even though the class has a throwing destructor subobject in an anonymous union
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115417 Bug ID: 115417 Summary: Destructor is noexcept even though the class has a throwing destructor subobject in an anonymous union Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- Consider the following code: #include <type_traits> struct D { ~D() noexcept(false); }; struct B { union { char c; D d; }; ~B() {} }; static_assert(std::is_nothrow_destructible_v<B>); This assertion passes only for GCC, but not for other compilers (https://godbolt.org/z/e5jEKzjfs). I believe GCC is wrong because 'd' is a potentially constructed subobject of B (https://eel.is/c++draft/except.spec#8) and therefore, a destructor without a noexcept-specifier should be potentially throwing. This bug could be related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109768, but I don't think it is a duplicate.
[Bug tree-optimization/115147] exp2 with integer arguments could be translated into ldexp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115147 --- Comment #1 from Jan Schultke --- I did some quick low-quality benchmarks. It doesn't seem to make any kind of difference for libc++ and clang: https://quick-bench.com/q/aq1mZ1sKTWHzdmZf5D7BO2yJ1Yo (or for libstdc++ and clang) For GCC and libstdc++, ldexp turns out to be substantially slower than exp2, which is very surprising: https://quick-bench.com/q/iqGdSMmsUIYNo8O8jC7Py8yOg84
[Bug tree-optimization/115147] New: exp2 with integer arguments could be translated into ldexp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115147 Bug ID: 115147 Summary: exp2 with integer arguments could be translated into ldexp Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- https://godbolt.org/z/9141MdG6b float e(int x) { return __builtin_exp2f(x); } GCC (-O2) optimizes this to: > e(int): > pxor xmm0, xmm0 > cvtsi2ss xmm0, edi > jmp exp2f Clang (-O2) optimizes this to: > .LCPI0_0: > .long 0x3f800000 > e(int): > movss xmm0, dword ptr [rip + .LCPI0_0] > jmp ldexpf@PLT I believe translating to ldexp is usually better, and this is a missed optimization. ldexp takes an integer exponent. Converting x to float and then going through exp2 seems wasteful in this case.
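At the source level, the transformation Clang performs corresponds roughly to the following sketch. The function name exp2_int is hypothetical; std::ldexp is the standard <cmath> function:

```cpp
#include <cmath>

// Hypothetical name. Computes 2^x for an integer x without first
// converting x to float, which is what the ldexpf call emitted by Clang does.
float exp2_int(int x) {
    return std::ldexp(1.0f, x);  // 1.0f scaled by 2^x, exponent stays an int
}
```

This only illustrates the equivalence for integer exponents; it says nothing about which call is faster in a given C library.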
[Bug c++/115085] Variable unqualified-id is falsely treated as rvalue when appearing in braced-init-list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115085 --- Comment #4 from Jan Schultke --- https://github.com/cplusplus/CWG/issues/536
[Bug c++/115085] Variable unqualified-id is falsely treated as rvalue when appearing in braced-init-list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115085 --- Comment #1 from Jan Schultke --- Another user suggested that this is caused by falsely performing temporary materialization. This would make a an xvalue, which would also make the reference binding fail.
[Bug c++/115085] New: Variable unqualified-id is falsely treated as rvalue when appearing in braced-init-list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115085 Bug ID: 115085 Summary: Variable unqualified-id is falsely treated as rvalue when appearing in braced-init-list Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- > int a{}, b = decltype((a)){a}; GCC falsely rejects this (https://godbolt.org/z/a6c4h8Mhv). If you're having trouble reading this, it also rejects: > using T = int&; > int a{}, b = T{a}; Splitting this into multiple lines is also not relevant. The error is: > :3:17: error: cannot bind non-const lvalue reference of type 'T' {aka > 'int&'} to an rvalue of type 'int' >3 | int a{}, b = T{a}; > | ^ There is no apparent reason why 'a' should be an rvalue in this context. It is not move-eligible within the initializer of a variable. My guess is that aggregate initialization performs copy-initialization for each initializer and then gets a prvalue out of that. Dunno, it's quite weird. See https://stackoverflow.com/a/78477064/5740428 for a more in-depth explanation of the relevant wording.
[Bug c++/55004] [meta-bug] constexpr issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004 Bug 55004 depends on bug 114219, which changed state. Bug 114219 Summary: [11/12/13/14 Regression] [expr.const] lvalue-to-rvalue conversion is not diagnosed to disqualify constant expressions for empty classes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114219 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID
[Bug c++/114219] [11/12/13/14 Regression] [expr.const] lvalue-to-rvalue conversion is not diagnosed to disqualify constant expressions for empty classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114219 Jan Schultke changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #5 from Jan Schultke --- Looks like the existing comments were right. Lvalue-to-rvalue conversion very rarely takes place for class types, and if it does, that may be a wording bug. Instead of lvalue-to-rvalue conversion, this case calls the implicitly-defined copy constructor, which can be used in constant expressions.
[Bug c++/114219] [11/12/13/14 Regression] [expr.const] lvalue-to-rvalue conversion is not diagnosed to disqualify constant expressions for empty classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114219 --- Comment #4 from Jan Schultke --- I don't see how lvalue-to-rvalue conversion would be bypassed here. https://eel.is/c++draft/conv.lval#:conversion,lvalue-to-rvalue has no special provision for empty classes. https://eel.is/c++draft/dcl.init.general#16.9 would necessitate lvalue-to-rvalue conversion because the initializer has to be converted to a prvalue. I couldn't find any special rule for empty classes.
[Bug c++/114219] [11/12/13/14 Regression] [expr.const] lvalue-to-rvalue conversion is not diagnosed to disqualify constant expressions for empty classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114219 --- Comment #2 from Jan Schultke --- Corresponding LLVM bug: https://github.com/llvm/llvm-project/issues/83712
[Bug c++/114219] New: [expr.const] lvalue-to-rvalue conversion is not diagnosed to disqualify constant expressions for empty classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114219 Bug ID: 114219 Summary: [expr.const] lvalue-to-rvalue conversion is not diagnosed to disqualify constant expressions for empty classes Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- https://godbolt.org/z/jhcP8WPn8 Code to reproduce = struct S {}; constexpr int f(S s) { constexpr S _ = s; return 0; } S s; constexpr int _ = f(s); Issue description = This code compiles, but should produce errors for both constexpr initializations. According to [expr.const] p5.9, the initializers of _ cannot be constant expressions because they contain an lvalue-to-rvalue conversion of s, whose lifetime did not begin within the evaluation of the constant expression, and which is not usable in constant expressions. It shouldn't matter whether S is an empty class or has members; the lvalue-to-rvalue conversion in itself disqualifies expressions from being constant expressions.
[Bug middle-end/114086] Boolean switches could have a lot better codegen, possibly utilizing bit-vectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086 --- Comment #5 from Jan Schultke --- Well, it's not quite equivalent to either of the bit-shifts we've posted. To account for shifting more than the operand size, it would be: bool foo (int x) { return x > 6 ? 0 : ((85 >> x) & 1); } This is exactly what GCC does and the branch can be explained by this range check. So I guess GCC already does optimize this to a bit-vector, it just doesn't find the optimization to: bool foo(int x) { return (x & -7) == 0; } This is very specific to this particular switch statement though. You could do better than having a branch if the hardware supported a saturating shift, but probably not on x86_64. Nevermind that; if anything, this isn't middle-end.
[Bug middle-end/114086] Boolean switches could have a lot better codegen, possibly utilizing bit-vectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086 --- Comment #2 from Jan Schultke --- Yeah right, the actual optimal output (which clang finds) is: > test_switch(E): > test edi, -7 > sete al > ret Testing with -7 (that is, ~6) checks the lowest bit and also makes sure that the bits with value 8 and greater are all zero.
[Bug middle-end/114086] New: Boolean switches could have a lot better codegen, possibly utilizing bit-vectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086 Bug ID: 114086 Summary: Boolean switches could have a lot better codegen, possibly utilizing bit-vectors Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- https://godbolt.org/z/3acqbbn3E enum struct E { a, b, c, d, e, f, g, h }; bool test_switch(E e) { switch (e) { case E::a: case E::c: case E::e: case E::g: return true; default: return false; } } Expected output === test_switch(E): mov eax, edi and eax, 1 ret Actual output (-O3) === test_switch(E): xor eax, eax cmp edi, 6 ja .L1 mov eax, 85 bt rax, rdi setc al .L1: ret Explanation === Boolean switches in general can be optimized a lot better than what GCC currently does. Clang does find the optimization to a bitwise AND, although this may be a big ask. Generally, contiguous boolean switches (that is, switch statements where all cases yield a boolean value and the labels are contiguous) can be optimized to accessing a bit vector. That switch could have been transformed into: > return 0b01010101 >> int(e);
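The three forms discussed in this thread (the original switch, the range-checked bit-vector lookup, and the single mask test) can be compared side by side. This is a sketch; the names test_bitvector and test_mask are hypothetical, and the unsigned casts are an assumption made here to keep the shift well-defined for out-of-range values:

```cpp
enum struct E { a, b, c, d, e, f, g, h };

// The switch from the report.
constexpr bool test_switch(E e) {
    switch (e) {
    case E::a:
    case E::c:
    case E::e:
    case E::g: return true;
    default: return false;
    }
}

// Bit-vector form: bit i of 0b01010101 (85) holds the result for value i.
// The range check avoids shifting by more than the highest set bit.
constexpr bool test_bitvector(E e) {
    unsigned x = static_cast<unsigned>(e);
    return x > 6 ? false : ((0b01010101u >> x) & 1u) != 0;
}

// Mask form: true only if no bit other than bits 1 and 2 is set,
// i.e. the value is one of 0, 2, 4, 6.
constexpr bool test_mask(E e) {
    return (static_cast<unsigned>(e) & ~6u) == 0;
}

static_assert(test_switch(E::g) && test_bitvector(E::g) && test_mask(E::g), "");
static_assert(!test_switch(E::h) && !test_bitvector(E::h) && !test_mask(E::h), "");
```

The mask form is specific to this particular set of case labels, whereas the bit-vector form applies to any contiguous boolean switch whose label range fits in a machine word.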
[Bug c++/114006] New: False positive diagnostic -Wpedantic for zero-size arrays, most vexing parse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114006 Bug ID: 114006 Summary: False positive diagnostic -Wpedantic for zero-size arrays, most vexing parse Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- I got a false positive warning when compiling LLVM with g++. Here is a minimal repro: struct string { const char* data; string operator+(const char*); }; int LLVMFuzzerInitialize(int*, char***argv) { string ExitOnErr(string(*argv[0]) + ": error:"); return 0; } : In function 'int LLVMFuzzerInitialize(int*, char***)': :7:35: warning: ISO C++ forbids zero-size array 'argv' [-Wpedantic] 7 | string ExitOnErr(string(*argv[0]) + ": error:"); | It looks like GCC thinks that this is a most vexing parse; i.e. it thinks that argv[0] is a declarator, not a subscript operator. This cannot be correct because the next expression is + ": error" so this cannot be parsed as a function declaration. I suspect that the diagnostic for zero-size arrays is prematurely emitted, before it's actually known whether this is a function declaration or not.
[Bug middle-end/113988] during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988 --- Comment #2 from Jan Schultke --- Oh yeah, I should have noted that this only happens for AVX-512 targets. Changing -march=znver4 to -march=znver3 stops the ICE.
[Bug c++/113988] New: during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988 Bug ID: 113988 Summary: during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- https://godbolt.org/z/3ez1erYa3 unsigned rem_fast(_BitInt(512) x, unsigned y) { const int size = sizeof(x) / 4; unsigned digits[size]; __builtin_memcpy(digits, &x, sizeof(digits)); unsigned rem = 0; for (int i = 0; i < size; ++i) { unsigned long long temp = (unsigned long long) rem << 32 | digits[size - i - 1]; rem = temp % y; } return rem; } during GIMPLE pass: bitintlower : In function 'rem_fast': :1:10: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470 1 | unsigned rem_fast(_BitInt(512) x, unsigned y) { | ^~~~ 0x233bebc internal_error(char const*, ...) ???:0 0x96c2ff fancy_abort(char const*, int, char const*) ???:0
[Bug c++/113982] New: Poor codegen for 64-bit add with carry widening functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113982 Bug ID: 113982 Summary: Poor codegen for 64-bit add with carry widening functions Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- I was trying to get optimal codegen for a 64-bit addition with a carry, but it's tough to do with GCC: > struct add_result { > unsigned long long sum; > bool carry; > }; > > add_result add_wide_1(unsigned long long x, unsigned long long y) { > auto r = (unsigned __int128) x + y; > return add_result{static_cast<unsigned long long>(r), bool(r >> 64)}; > } > > add_result add_wide_2(unsigned long long x, unsigned long long y) { > unsigned long long r; > bool carry = __builtin_add_overflow(x, y, &r); > return add_result{r, carry}; > } ## Expected output (clang -march=x86-64-v4 -O3) add_wide_1(unsigned long long, unsigned long long): mov rax, rdi add rax, rsi setb dl ret add_wide_2(unsigned long long, unsigned long long): mov rax, rdi add rax, rsi setb dl ret ## Actual output (GCC -march=x86-64-v4 -O3) (https://godbolt.org/z/qGc9WeEvK) add_wide_1(unsigned long long, unsigned long long): mov rcx, rdi lea rax, [rdi+rsi] xor edx, edx xor edi, edi add rsi, rcx adc rdi, 0 mov dl, dil and dl, 1 ret add_wide_2(unsigned long long, unsigned long long): add rdi, rsi mov edx, 0 mov rax, rdi setc dl ret The output for the 128-bit version looks pretty bad. It looks like GCC isn't aware that we only access the carry bit, so it doesn't need to do full 128-bit arithmetic so to speak. The add_wide_2 output also isn't optimal. Why would it output "mov edx, 0" instead of "xor edx, edx"?
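For comparison, the carry can also be computed without __int128 or a builtin. The sketch below repeats add_wide_2 from the report and adds a portable variant; add_wide_portable is a hypothetical name and is not part of the original report:

```cpp
struct add_result {
    unsigned long long sum;
    bool carry;
};

// As in the report: relies on the GCC/Clang builtin __builtin_add_overflow.
add_result add_wide_2(unsigned long long x, unsigned long long y) {
    unsigned long long r;
    bool carry = __builtin_add_overflow(x, y, &r);
    return add_result{r, carry};
}

// Hypothetical portable variant: unsigned addition wraps modulo 2^64,
// so a carry out occurred exactly when the wrapped sum is less than x.
add_result add_wide_portable(unsigned long long x, unsigned long long y) {
    unsigned long long r = x + y;
    return add_result{r, r < x};
}
```

All three formulations describe the same operation, so ideally a compiler would emit the same add/setb sequence for each of them.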
[Bug c++/113914] New: GCC accepts user-defined integer-literal that does not fit in any type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113914 Bug ID: 113914 Summary: GCC accepts user-defined integer-literal that does not fit in any type Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- > int operator""_zero(unsigned long long); > int x = 0x1_zero; This code is ill-formed, but GCC does not emit a diagnostic (https://godbolt.org/z/r9KrGGafY). Note that as per https://eel.is/c++draft/lex.ext#3, this is treated like a call: > operator""_zero(0x1000...000ULL) However, the ULL-suffixed integer-literal would be ill-formed. Clang rejects this.
[Bug c++/113821] New: Missed optimization for final classes: expensive check for result of dynamic cast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113821 Bug ID: 113821 Summary: Missed optimization for final classes: expensive check for result of dynamic cast Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- Code to reproduce (https://godbolt.org/z/48cabYT78) === struct Base { virtual ~Base() = default; }; struct Derived final : Base {}; bool is_derived(Base& a) { return dynamic_cast<Derived*>(&a); } Expected output (clang) === is_derived(Base&): lea rax, [rip + vtable for Derived+16] cmp qword ptr [rdi], rax sete al ret Actual output (GCC) === is_derived(Base&): sub rsp, 8 xor ecx, ecx mov edx, OFFSET FLAT:typeinfo for Derived mov esi, OFFSET FLAT:typeinfo for Base call __dynamic_cast test rax, rax setne al add rsp, 8 ret Explanation === For final classes, checking for success of dynamic_cast is equivalent to checking whether the vtable pointer equals that of the destination type. GCC is overly pessimistic by calling __dynamic_cast, which is much more expensive, I'd imagine. Clang emits the same kind of pessimistic code only when Derived is not final.
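The equivalence claimed in the explanation has a source-level analogue: for a final class, a successful dynamic_cast means the dynamic type is exactly that class, which typeid can test directly. A sketch, where Other and is_derived_exact are hypothetical names added for illustration:

```cpp
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived final : Base {};
struct Other : Base {};  // hypothetical sibling class, not in the report

// As in the report.
bool is_derived(Base& a) { return dynamic_cast<Derived*>(&a) != nullptr; }

// Exact dynamic type test. Because Derived is final, no object can have a
// dynamic type that is derived from Derived, so both functions agree.
bool is_derived_exact(Base& a) { return typeid(a) == typeid(Derived); }
```

If Derived were not final, the two functions would differ for objects of classes derived from Derived, which is why the cheap check is only valid in the final case.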
[Bug c++/113745] New: Poor diagnostics quality for resize() without a default-constructible type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113745 Bug ID: 113745 Summary: Poor diagnostics quality for resize() without a default-constructible type Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to reproduce (https://godbolt.org/z/6ETnffr8c) #include <vector> struct non_default_constructible { non_default_constructible(int) {} }; int main() { std::vector<non_default_constructible> v; v.resize(0); } ## Diagnostic In file included from /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_iterator.h:78, from /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_algobase.h:67, from /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/vector:62, from :1: /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h: In instantiation of 'constexpr void std::_Construct(_Tp*, _Args&& ...) [with _Tp = non_default_constructible; _Args = {}]': /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_uninitialized.h:643:18: required from 'static constexpr _ForwardIterator std::__uninitialized_default_n_1<_TrivialValueType>::__uninit_default_n(_ForwardIterator, _Size) [with _ForwardIterator = non_default_constructible*; _Size = long unsigned int; bool _TrivialValueType = false]' 643 | std::_Construct(std::__addressof(*__cur)); | ~~~^~ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_uninitialized.h:701:22: required from 'constexpr _ForwardIterator std::__uninitialized_default_n(_ForwardIterator, _Size) [with _ForwardIterator = non_default_constructible*; _Size = long unsigned int]' 700 | return __uninitialized_default_n_1:: | 701 | __uninit_default_n(__first, __n); | ~~^~ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_uninitialized.h:779:44: required from 'constexpr _ForwardIterator std::__uninitialized_default_n_a(_ForwardIterator, 
_Size, allocator<_Tp>&) [with _ForwardIterator = non_default_constructible*; _Size = long unsigned int; _Tp = non_default_constructible]' 779 | { return std::__uninitialized_default_n(__first, __n); } | ~~^~ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/vector.tcc:821:35: required from 'constexpr void std::vector<_Tp, _Alloc>::_M_default_append(size_type) [with _Tp = non_default_constructible; _Alloc = std::allocator; size_type = long unsigned int]' 821 | std::__uninitialized_default_n_a(this->_M_impl._M_finish, | ^ 822 | __n, _M_get_Tp_allocator()); | ~~~ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_vector.h:1013:4: required from 'constexpr void std::vector<_Tp, _Alloc>::resize(size_type) [with _Tp = non_default_constructible; _Alloc = std::allocator; size_type = long unsigned int]' 1013 | _M_default_append(__new_size - size()); | ^ :9:13: required from here 9 | v.resize(0); | ^~~ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h:115:28: error: no matching function for call to 'construct_at(non_default_constructible*&)' 115 | std::construct_at(__p, std::forward<_Args>(__args)...); | ~^ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h:94:5: note: candidate: 'template constexpr decltype (::new(void*(0)) _Tp) std::construct_at(_Tp*, _Args&& ...)' 94 | construct_at(_Tp* __location, _Args&&... __args) | ^~~~ /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h:94:5: note: template argument deduction/substitution failed: /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h: In substitution of 'template constexpr decltype (::new(void*(0)) _Tp) std::construct_at(_Tp*, _Args&& ...) [with _Tp = non_default_constructible; _Args = {}]': /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h:115:21: required from 'constexpr void std::_Construct(_Tp*, _Args&& ...) 
[with _Tp = non_default_constructible; _Args = {}]' /opt/compiler-explorer/gcc-trunk-20240203/include/c++/14.0.1/bits/stl_construct.h
[Bug c++/113581] Ignoring GCC unroll loop annotation for loops with increment in condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113581 --- Comment #1 from Jan Schultke --- Oh, I probably should have mentioned this: This only happens when times_three is a function template.
[Bug c++/113581] New: Ignoring GCC unroll loop annotation for loops with increment in condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113581 Bug ID: 113581 Summary: Ignoring GCC unroll loop annotation for loops with increment in condition Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- Code to reproduce (https://godbolt.org/z/r6EcW7ocW) === template <typename T> int times_three(T x) { int sum = 0; #pragma GCC unroll 16 for (int i = 0; i++ < 3;) { sum += x; } return sum; } int main() { times_three(123u); } Output == > : In function 'int times_three(T) [with T = unsigned int]': > :5:19: warning: ignoring loop annotation > 5 | for (int i = 0; i++ < 3;) { > | ^ Explanation === It looks like GCC doesn't like loops that have an increment in the condition. Changing the loop to the following solves this issue: > for (int i = 0; i < 3; i++) For increment, this is perhaps not a big deal. I've run into this bug when working with a loop of the form: > for (int i = N; i-- > 0;) This is a popular idiom for iterating downwards with i in range [0, N).
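To make the decrement idiom concrete, here is a sketch that records the visited indices, assuming the intended condition is i-- > 0. The second function moves the update out of the condition, analogous to the rewrite suggested for the increment case. Both function names are hypothetical:

```cpp
#include <vector>

// Decrement-in-condition idiom: visits i = n-1, n-2, ..., 0.
// The test uses the old value of i, the body sees the decremented value.
std::vector<int> indices_idiom(int n) {
    std::vector<int> out;
    for (int i = n; i-- > 0;) {
        out.push_back(i);
    }
    return out;
}

// Same iteration order with the update in the loop's third clause.
std::vector<int> indices_plain(int n) {
    std::vector<int> out;
    for (int i = n - 1; i >= 0; --i) {
        out.push_back(i);
    }
    return out;
}
```

The idiom is popular mainly because it also works with unsigned index types, where the condition i >= 0 of the second form would always be true.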
[Bug c++/113565] __builtin_clz(0) is undefined behavior, but not detected in constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113565 --- Comment #5 from Jan Schultke --- You can reproduce this as follows: > static_assert(__builtin_clz(0u) == 32); > > unsigned x = 0; > > int main() { > return __builtin_clz(x); > } For base x86_64, GCC emits: (https://godbolt.org/z/nqzYrTWd1) > main: > bsr eax, DWORD PTR x[rip] > xor eax, 31 > ret > x: > .zero 4 Even though __builtin_clz(0u) == 32 passes, this program returns 63. This is obviously not in the interest of the developer, regardless of what the standard mandates.
[Bug c++/113565] __builtin_clz(0) is undefined behavior, but not detected in constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113565 --- Comment #4 from Jan Schultke --- I would expect an error here because things that are undefined behavior are generally supposed to fail in constant expressions. I don't see a good reason why builtins should be exempt from that rule. The lack of diagnostic has cost me a few minutes of debugging yesterday. I had a static_assert: > static_assert(my_function(0u) == 32); This succeeded at compile time, but my_function(0) would return 0 at run-time as a result of passing through to __builtin_clz. UBSan may have caught it, but regardless, it's not sane to have different results inside/outside constant expressions like that.
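Until such a diagnostic exists, the mismatch can be avoided by handling zero before calling the builtin. A minimal sketch; clz_or_width is a hypothetical name and is not the my_function mentioned above:

```cpp
#include <limits>

// Hypothetical wrapper: returns the bit width of unsigned for x == 0 and
// otherwise forwards to the GCC/Clang builtin, which is only defined for
// nonzero arguments.
constexpr int clz_or_width(unsigned x) {
    return x == 0 ? std::numeric_limits<unsigned>::digits : __builtin_clz(x);
}

// The zero case no longer reaches the builtin, so compile-time and
// run-time evaluation agree.
static_assert(clz_or_width(0u) == std::numeric_limits<unsigned>::digits, "");
static_assert(clz_or_width(1u) == std::numeric_limits<unsigned>::digits - 1, "");
```

In C++20, std::countl_zero from <bit> provides the same guarantee for zero without a hand-written wrapper.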
[Bug c++/113565] New: __builtin_clz(0) is undefined behavior, but not detected in constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113565 Bug ID: 113565 Summary: __builtin_clz(0) is undefined behavior, but not detected in constant expressions Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- Code to reproduce = static_assert(__builtin_clz(0)); Actual output (https://godbolt.org/z/v6v5nGxv8) === None. Expected output (Clang) === :1:15: error: static assertion expression is not an integral constant expression 1 | static_assert(__builtin_clz(0)); | ^~~~
[Bug c++/113564] New: ICE internal compiler error when calling a concept as a function in a template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113564 Bug ID: 113564 Summary: ICE internal compiler error when calling a concept as a function in a template Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- https://godbolt.org/z/nG9GGzj44 -- template concept tru = true; template void awoo(T x) { tru(x); } template void awoo(unsigned); -- : In instantiation of 'void awoo(T) [with T = unsigned int]': :9:28: required from here 9 | template void awoo(unsigned); |^ :6:11: internal compiler error: in tsubst_expr, at cp/pt.cc:20985 6 | tru(x); | ~~^~~ 0x264416c internal_error(char const*, ...) ???:0 0xa5086d fancy_abort(char const*, int, char const*) ???:0 0xc89903 instantiate_decl(tree_node*, bool, bool) ???:0 0xcb398b instantiate_pending_templates(int) ???:0 0xb53299 c_parse_final_cleanups() ???:0 0xda6478 c_common_parse_file() ???:0
[Bug c++/113543] New: Poor codegen for bit-counting functions (countl_zero, countl_one, countr_zero, countr_one)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113543 Bug ID: 113543 Summary: Poor codegen for bit-counting functions (countl_zero, countl_one, countr_zero, countr_one) Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to Reproduce (https://godbolt.org/z/qPeszhaPv) #include <bit> template <typename T> T countr_zero(T x) { return std::countr_zero(x); } template unsigned char countr_zero(unsigned char); template unsigned short countr_zero(unsigned short); template unsigned int countr_zero(unsigned int); template unsigned long countr_zero(unsigned long); template unsigned long long countr_zero(unsigned long long); template <typename T> T countr_one(T x) { return std::countr_one(x); } template unsigned char countr_one(unsigned char); template unsigned short countr_one(unsigned short); template unsigned int countr_one(unsigned int); template unsigned long countr_one(unsigned long); template unsigned long long countr_one(unsigned long long); template <typename T> T countl_zero(T x) { return std::countl_zero(x); } template unsigned char countl_zero(unsigned char); template unsigned short countl_zero(unsigned short); template unsigned int countl_zero(unsigned int); template unsigned long countl_zero(unsigned long); template unsigned long long countl_zero(unsigned long long); template <typename T> T countl_one(T x) { return std::countl_zero(x); } template unsigned char countl_one(unsigned char); template unsigned short countl_one(unsigned short); template unsigned int countl_one(unsigned int); template unsigned long countl_one(unsigned long); template unsigned long long countl_one(unsigned long long); ## Summary GCC consistently emits much more code for these functions than clang. For example, GCC: > unsigned int countl_one(unsigned int): > xor eax, eax > lzcnt eax, edi > ret Clang does not emit the extra xor instruction. I don't really know why. 
LZCNT has a wide contract and should be equivalent to std::countl_zero. It gets a lot worse though: > unsigned short countl_zero(unsigned short): > mov eax, 16 > testdi, di > je .L23 > movzx edi, di > lzcnt edi, edi > lea eax, [rdi-16] > .L23: > ret I don't really know what all of this schmutz is. Clang emits lzcnt and ret in this case. Another bit of disappointing codegen is this: > unsigned char countr_zero(unsigned char): > movzx eax, dil > xor edx, edx > tzcnt edx, eax > testdil, dil > mov eax, 8 > cmovne eax, edx > ret Clang emits: > or edi, 256 > tzcnt eax, edi > ret This clang codegen is very clever. It simply adds a bit on the left, so that the 32-bit routine can be re-used with only one additional instruction.
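The "add a bit on the left" trick can be sketched in portable C++. This is only an illustration, not what either compiler does internally: the names ctz32, ctz8_reference and ctz8_sentinel are hypothetical, and a plain loop stands in for the tzcnt instruction. Setting bit 8 before counting means the 32-bit count never sees a zero input and naturally yields 8 when the low byte is zero, so no branch or cmov is needed.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for a 32-bit trailing-zero count (the job tzcnt does).
// Precondition: v != 0, otherwise the loop would not terminate.
inline int ctz32(std::uint32_t v) {
    int n = 0;
    while ((v & 1u) == 0u) {
        v >>= 1;
        ++n;
    }
    return n;
}

// Reference behaviour for an 8-bit input: 8 for zero, otherwise the
// index of the lowest set bit. This mirrors the test/cmov shape.
inline int ctz8_reference(std::uint8_t x) {
    if (x == 0) {
        return 8;
    }
    return ctz32(x);
}

// Sentinel variant: OR in bit 8 (value 256) so the 32-bit count is
// always well defined and stops at 8 when all low bits are zero.
inline int ctz8_sentinel(std::uint8_t x) {
    return ctz32(static_cast<std::uint32_t>(x) | 0x100u);
}
```

Both functions agree for all 256 possible inputs; the sentinel version just trades the zero check for a single OR.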
[Bug libstdc++/113386] [C++23] std::pair comparison operators should be transparent, but are not in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386 --- Comment #7 from Jan Schultke --- I've noticed that too by now. What confuses me is that both libc++ and MSVC STL implement it as if it was a DR, so transparent comparisons work even outside C++23 mode. Is it just a collective mistake, or what's going on with that? What would be the right way to implement it in libstdc++ then?
[Bug libstdc++/113386] [C++23] std::pair comparison operators should be transparent, but are not in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386 --- Comment #5 from Jan Schultke --- My bad again, it's a defect report, so cppreference is fine.
[Bug libstdc++/113386] [C++23] std::pair comparison operators should be transparent, but are not in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386 --- Comment #4 from Jan Schultke --- My bad. https://en.cppreference.com/w/cpp/utility/pair/operator_cmp currently shows > template< class T1, class T2, class U1, class U2 > > bool operator==( const std::pair<T1, T2>& lhs, const std::pair<U1, U2>& rhs ); > (until C++14) I'll fix this page. Never trust cppreference blindly I guess :)
[Bug libstdc++/113386] std::pair comparison operators should be transparent, but are not in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386 --- Comment #1 from Jan Schultke --- https://godbolt.org/z/9x9n4bGKK
[Bug libstdc++/113386] New: std::pair comparison operators should be transparent, but are not in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386 Bug ID: 113386 Summary: std::pair comparison operators should be transparent, but are not in libstdc++ Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to reproduce #include <utility> bool equals(const std::pair& a, const std::pair& b) { return a == b; } ## Explanation Clang with -stdlib=libc++ compiles this, as does MSVC. Bug #90203 was incorrectly closed. std::pair comparison operators should be transparent, see https://eel.is/c++draft/pairs.spec The standard requires the signature: > template<class T1, class T2, class U1, class U2> > constexpr bool operator==(const pair<T1, T2>& x, const pair<U1, U2>& y); libstdc++ incorrectly implements this with only two template parameters: > template<typename _T1, typename _T2> > inline _GLIBCXX_CONSTEXPR bool > operator==(const pair<_T1, _T2>& __x, const pair<_T1, _T2>& __y) > { return __x.first == __y.first && __x.second == __y.second; }
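To illustrate what "transparent" means here, the following sketch writes the four-parameter form as a free function. The name pair_equal is hypothetical and chosen so that it does not collide with the library's own operator==; the body is the same member-wise comparison quoted above, only with independent element types on each side.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper mirroring the four-parameter signature from
// [pairs.spec]: the two pairs may have different element types, as
// long as the corresponding members are comparable with ==.
template <class T1, class T2, class U1, class U2>
constexpr bool pair_equal(const std::pair<T1, T2>& x,
                          const std::pair<U1, U2>& y) {
    return x.first == y.first && x.second == y.second;
}
```

With this shape, comparing a std::pair of int with a std::pair of long needs no conversion of either pair; deduction succeeds directly because T1/T2 and U1/U2 are deduced separately.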
[Bug c++/113274] Memoization in template parameters is overly aggressive; false memoization for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113274 --- Comment #2 from Jan Schultke --- OOPS, I've messed up the repro. It should be true in the partial specialization. https://godbolt.org/z/11dW3cTfc
[Bug c++/113274] Memoization in template parameters is overly aggressive; false memoization for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113274 --- Comment #1 from Jan Schultke --- Original problem and more discussion: https://stackoverflow.com/q/4976/5740428
[Bug c++/113274] New: Memoization in template parameters is overly aggressive; false memoization for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113274 Bug ID: 113274 Summary: Memoization in template parameters is overly aggressive; false memoization for const pointers Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Minimal Reproducible Example (https://godbolt.org/z/Y7Kr9o546) template struct A { static constexpr bool value = false; }; template requires __is_same(decltype(p), int*) struct A { static constexpr bool value = false; }; int x = 0; //static_assert( A<&x>::value ); static_assert( A(&x)>::value == false ); ## Explanation Uncommenting the first static_assert causes compilation failure of the second static_assert. This should definitely not happen, as the following instantiations should be distinct: - A<(int*) &x> - A<(const int*) &x> GCC aggressively memoizes the first instantiation in A<&x>, which results in the subsequent A<(const int*)&x>::value being identical, even though it should not be.
[Bug c++/113242] New: g++ rejects-valid template argument of class type containing an lvalue reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113242 Bug ID: 113242 Summary: g++ rejects-valid template argument of class type containing an lvalue reference Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to Reproduce struct wrapper { int& ref; constexpr wrapper(int& ref) : ref(ref) {} }; template <wrapper X> void fun1() {} template <wrapper X> void fun2() { fun1<X>(); } int main() { static int val = 22; fun2<val>(); } ## Incorrect Output (GCC 14, -std=c++20) (https://godbolt.org/z/1jzqza73z) : In instantiation of 'void fun2() [with wrapper X = wrapper{val}]': :16:14: required from here 16 | fun2<val>(); | ~^~ :11:12: error: no matching function for call to 'fun1<X>()' 11 | fun1<X>(); | ~~~^~ :7:6: note: candidate: 'template<wrapper X> void fun1()' 7 | void fun1() {} | ^~~~ :7:6: note: template argument deduction/substitution failed: :11:12: error: the address of 'wrapper{val}' is not a valid template argument 11 | fun1<X>(); | ~~~^~ ## Explanation None. Clang compiles this, but GCC doesn't. The reference contained within wrapper is a valid template argument (see https://eel.is/c++draft/temp.arg.nontype#6), but GCC falsely treats it as disqualifying X in fun2 from being passed on as the template argument of fun1. See https://stackoverflow.com/a/77764351/5740428 for a more detailed explanation of the relevant standardese.
[Bug c++/111662] Rejects valid: cv-qualifiers are not removed from function parameters of variadic templated function types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111662 --- Comment #2 from Jan Schultke --- Bug was originally discovered here: https://stackoverflow.com/questions/77214665/problem-creating-template-function-alias-with-const-value-template-arguments/77215223#77215223
[Bug c++/111662] Rejects valid: cv-qualifiers are not removed from function parameters of variadic templated function types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111662 --- Comment #1 from Jan Schultke --- See https://godbolt.org/z/Kaf7jETaY
[Bug c++/111662] New: Rejects valid: cv-qualifiers are not removed from function parameters of variadic templated function types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111662 Bug ID: 111662 Summary: Rejects valid: cv-qualifiers are not removed from function parameters of variadic templated function types Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to reproduce template <typename... Ts> void f(Ts...) {} void(*pfi)(int) = &f<int>; // OK void(*pfci)(int) = &f<const int>; // error void(*pcfi)(const int) = &f<int>; // OK void(*pcfci)(const int) = &f<const int>; // error > :5:21: error: no matches converting function 'f' to type 'void > (*)(int)' > 5 | void(*pfci)(int) = &f<const int>; // error > | ^~~~ > :2:6: note: candidate is: 'template<class ... Ts> void f(Ts ...)' > 2 | void f(Ts...) {} > | ^ ## Explanation The error is nonsensical because the type of &f<const int> is already void(*)(int). No conversion is required. According to [dcl.fct] p5, cv-qualifiers are not part of the function type: > After producing the list of parameter types, > any top-level cv-qualifiers modifying a parameter type are deleted > when forming the function type. Variadic function templates are not exempt from this rule, and GCC should not reject this code. Making f non-templated, or using a single T parameter instead of a Ts... parameter pack makes this code compile.
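The [dcl.fct] rule quoted above can be checked directly with non-template code, which is the case the report says already works. The following is a small sketch; the names twice and ptwice are hypothetical and only serve the illustration. It avoids the variadic template from the report on purpose, so it should compile on any conforming compiler.

```cpp
#include <cassert>
#include <type_traits>

// Top-level cv-qualifiers on parameters are dropped when the function
// type is formed, so these are the very same types.
static_assert(std::is_same<void(int), void(const int)>::value,
              "top-level const on a parameter is not part of the function type");
static_assert(std::is_same<void (*)(int), void (*)(const int)>::value,
              "the same holds for pointers to such functions");

// Hypothetical function whose parameter is declared const.
inline int twice(const int x) { return 2 * x; }

// No conversion is needed: &twice already has type int (*)(int).
int (*ptwice)(int) = &twice;
```

The const in the definition of twice only affects the function body, where x cannot be modified; callers and function pointers never see it.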
[Bug c++/111277] New: braced-init-list allowed in a template-argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111277 Bug ID: 111277 Summary: braced-init-list allowed in a template-argument Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to reproduce (https://godbolt.org/z/ds7zo5Yeo) #include template struct ArrayPrimitive { constexpr ArrayPrimitive(const ValueType (&array)[Size]) {} }; template using ignore = void; using i = ignore<{{1, 2, 3}}>; ## Explanation A braced-init-list currently cannot appear in a template-argument. This is a known defect (https://cplusplus.github.io/CWG/issues/2450.html), but it is the status quo. GCC rejects ignore<{1, 2, 3}>, but it does not reject ignore<{{1, 2, 3}}>. I suspect it fails to recognize the latter as a braced-init-list for whatever reason. This bug was discovered here: https://stackoverflow.com/q/77032755/5740428
[Bug c++/57905] braced-init-list allowed in default template-argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57905 Jan Schultke changed: What|Removed |Added CC||janschultke at googlemail dot com --- Comment #1 from Jan Schultke --- Related: CWG 2450. braced-init-list as a template-argument https://cplusplus.github.io/CWG/issues/2450.html More related discussion: https://stackoverflow.com/q/77032755/5740428
[Bug c++/111174] New: G++ allows re-declaring function parameters as functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111174 Bug ID: 111174 Summary: G++ allows re-declaring function parameters as functions Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to Reproduce void foo(int x) { void x(int); } ## Expected Output The expected output is emitted in C mode, and also by clang in C/C++ mode (https://godbolt.org/z/b8Gbj69Kn) : In function 'foo': :2:10: error: 'x' redeclared as different kind of symbol 2 | void x(int); | ^ :1:14: note: previous definition of 'x' with type 'int' 1 | void foo(int x) { | ^ ## Actual Output (only in C++ mode) Compiles. ## Explanation There is no wording in the standard that allows function declarations to redeclare function parameters. This erroneous behavior only occurs for redeclaring function parameters; local/global variables seem unaffected.
[Bug c++/111173] G++ allows constinit functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111173 --- Comment #2 from Jan Schultke --- I think the problem is that GCC treats "constinit" exactly like "const" for the purpose of diagnostics. In https://eel.is/c++draft/dcl.fct#11, it is said that const applied to functions is ignored. GCC produces error messages like: > :1:1: error: 'constinit' on function return type is not allowed > 1 | constinit void foo(); > | ^ This does not make any sense; "constinit" wouldn't apply to the function return type in the first place, but to the declarator-id. See https://eel.is/c++draft/dcl.meaning.general#4
[Bug c++/111173] New: G++ allows constinit functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111173 Bug ID: 111173 Summary: G++ allows constinit functions Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to Reproduce using Function = int(); constinit Function f; ## Expected Output (provided by clang trunk) :2:1: error: constinit can only be used in variable declarations 2 | constinit Function f; | ^ 1 error generated. ## Actual Output (https://godbolt.org/z/7rdEhhj1s) This code compiles with any version of GCC that supports C++20 from what I could tell. constinit looks to have no effect.
[Bug c++/111079] New: Failing to reject a defaulted/deleted local function definition if it is a friend of a local class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111079 Bug ID: 111079 Summary: Failing to reject a defaulted/deleted local function definition if it is a friend of a local class Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to Reproduce A (https://godbolt.org/z/ar34vn5dT) void foo() { struct A; bool operator ==(const A&, const A&); struct A { friend bool operator ==(const A&, const A&); }; bool operator ==(const A&, const A&) = default; } ## Code to Reproduce B (https://godbolt.org/z/sdo9fhacb) void foo() { struct A; bool bar(const A&, const A&); struct A { friend bool bar(const A&, const A&); }; bool bar(const A&, const A&) = delete; } ## Expected Output Compiler error. ## Actual Output : In function 'void foo()': :7:40: warning: declaration of 'bool operator==(const foo()::A&, const foo()::A&)' has 'extern' and is initialized 7 | bool operator ==(const A&, const A&) = default; | ## Explanation According to [dcl.fct.def.general] p2: > A function shall be defined only in namespace or class scope. There is a definition of `operator==` at block scope, which is obviously invalid. Operator overloads are also considered functions ([over.oper.general]), so they should be subject to the same rules.
[Bug libstdc++/110945] std::basic_string::assign dramatically slower than other means of copying memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110945 --- Comment #8 from Jan Schultke --- (In reply to Jonathan Wakely from comment #4) > Please provide the testcase in a usable form, not just a link to an external > site (which uses its own custom benchmark macros). This is requested at > https://gcc.gnu.org/ Thanks, I will remember to do that in the future. > This is not equivalent to the other forms of copying in the benchmark, > because string::assign has to handle possible aliasing. It's valid to do > things like str.assign(str.data()+1, str.data()+2). From what I could read in the `char_traits::move` code that presumably gets called, this function explicitly tests for overlap between the memory regions, and dispatches to cheap functions if possible. The input size was 8 MiB, so it is unlikely that the overhead from this overlap detection is contributing in any relevant capacity. Basically, due to this overlap testing, `assign` SHOULD be just as fast as other methods if there is no overlap (and in this case, there clearly is none). However, it is 14x slower, so something is off. Either I haven't followed the logic correctly, or there is a mistake in this dispatching logic which leads to much worse performance for .assign().
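The aliasing case mentioned in the quoted reply can be made concrete with a tiny sketch. The function name assign_own_substring is hypothetical; the call inside it is exactly the form from the quote, where the source range points into the string being assigned to. It assumes the string has at least two characters.

```cpp
#include <cassert>
#include <string>

// Assigns the one-character range [data()+1, data()+2) of a string to
// the string itself. The source overlaps the destination, which is why
// assign cannot blindly behave like memcpy.
// Precondition (assumed for this sketch): str.size() >= 2.
inline std::string assign_own_substring(std::string str) {
    str.assign(str.data() + 1, str.data() + 2);
    return str;
}
```

For an input such as "abc" the result is the single character that was at index 1. The benchmark in this bug never exercises this path, since its source and destination buffers are separate.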
[Bug libstdc++/110945] std::basic_string::assign dramatically slower than other means of copying memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110945 --- Comment #3 from Jan Schultke --- When increasing the input size to 8 MiB, the results become more similar to what clang delivers for 1 MiB too: https://quick-bench.com/q/DFHYW6eZq-FAif8xuLkBOPwzYWA
[Bug libstdc++/110945] std::basic_string::assign dramatically slower than other means of copying memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110945 --- Comment #2 from Jan Schultke --- Also it looks like GCC doesn't emit memcpy or memmove in either of the first benchmarks. Those statements refer to the corresponding clang output, actually. What's consistent for both compilers is that .assign() is dramatically slower than any other method.
[Bug libstdc++/110945] std::basic_string::assign dramatically slower than other means of copying memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110945 Jan Schultke changed: What|Removed |Added Keywords||missed-optimization --- Comment #1 from Jan Schultke --- Oops, I meant that it calls __builtin_memmove(). Well, neither memmove nor memcpy are visible in the output.
[Bug libstdc++/110945] New: std::basic_string::assign dramatically slower than other means of copying memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110945 Bug ID: 110945 Summary: std::basic_string::assign dramatically slower than other means of copying memory Product: gcc Version: 12.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- See https://quick-bench.com/q/bqGjfyd180oOlJhiY_XnURMNKG8 Using the copy constructor performs best, and ends up using std::memcpy internally. Even using .resize() and std::copy is much faster than .assign(), because it is subject to more partial loop unrolling. basic_string::assign: https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.h#L1713C28-L1713C28 this calls the four-iterator form of .replace(): https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.h#L2378 this calls this form of _M_replace_dispatch(): (I think) https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.tcc#L430 this calls _M_replace(): https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.tcc#L507 in this case, it should call _S_move(): https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.h#L431 this calls char_traits::move(): https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/char_traits.h#L223 and that calls __builtin_memcpy() However, I must have followed this chain of calls incorrectly, because I do not see a call to memmove in the output assembly, and most of the time is spent here: >nopl (%rax) >movdqa 0x42d8a0(%rdx),%xmm0 > 63.27% movups %xmm0,(%rax,%rdx,1) > 36.69% add$0x10,%rdx > 0.03% cmp$0x10,%rdx
[Bug c++/110912] False assumption that constructors cannot alias any of their parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110912 --- Comment #2 from Jan Schultke --- (In reply to Jiang An from comment #1) > The restriction agains aliasing was intended, see > https://cplusplus.github.io/CWG/issues/2271.html. > > The status quo seems to be that in the body of `A::A(int &x)`, compilers can > assume that the value of `x` won't be changed by a modification on `*this`, > but not the other way around. Then this status quo is not correctly implemented, because in the example, GCC assumes that a change of `x` (see `x = 5`) cannot alter `this->i` (see `i == 0` assumed to be always true). It is not enough to put `__restrict` on the parameters; a much weaker modifier must be used for this purpose. At most, a "one-way `__restrict`" must be used, if such a thing exists.
[Bug c++/110925] New: Unnecessary dynamic initialization in trivial cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110925 Bug ID: 110925 Summary: Unnecessary dynamic initialization in trivial cases Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- ## Code to Reproduce int z = 0; int x = z; ## Expected Output (delivered by clang trunk -O3) x: .zero 4 z: .zero 4 ## Actual Output (x86_64 GCC 14 -O3) (https://godbolt.org/z/95d9hj3he) _GLOBAL__sub_I_z: mov eax, DWORD PTR z[rip] mov DWORD PTR x[rip], eax ret x: .zero 4 z: .zero 4 ## Explanation The implementation is allowed to turn this into static initialization. See https://eel.is/c++draft/basic.start.static#3. However, GCC emits unnecessary dynamic initialization code.
[Bug c++/110916] New: [12/13/14 Regression] Architecture-dependent missed optimizations for double swapping
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110916 Bug ID: 110916 Summary: [12/13/14 Regression] Architecture-dependent missed optimizations for double swapping Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- GCC's ability to eliminate redundant stores and loads is oddly dependent on the architecture. Even on the same overall arch, compiling for Skylake in particular always performs best. On x86_64 -march=x86-64-v2, GCC 11 provides the optimal output. GCC 12/13/14 provide suboptimal output compared to -march=skylake. On ARM64, a strange load and store from/to the same register is emitted. This is the case for all versions of GCC available on Compiler Explorer. ## Code to Reproduce (https://godbolt.org/z/d7Kcdn8fo) static void swap(int* restrict a, int* restrict b) { const int tmp = *a; *a = *b; *b = tmp; } void double_swap_alias(int* a, int* b) { swap(a, b); swap(a, b); } ## Expected Output (x86_64 GCC 14 -O3 -march=skylake) ret ## Actual Output (x86_64 GCC 14 -O3 -march=x86-64-v2) mov edx, DWORD PTR [rsi] mov eax, DWORD PTR [rdi] mov DWORD PTR [rdi], edx mov DWORD PTR [rsi], eax mov edx, DWORD PTR [rdi] mov DWORD PTR [rdi], eax mov DWORD PTR [rsi], edx ret ## Actual Output (ARM64 GCC 14 -O3) ldr w0, [x1] str w0, [x1] ret
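Why a bare ret is the expected output can be seen from the observable behaviour: swapping two distinct objects twice restores both. The sketch below is a C++ adaptation of the repro, under the assumption that the compiler accepts the `__restrict` extension (GCC, Clang and MSVC do) in place of C's `restrict`; the names swap_ints and double_swap are hypothetical.

```cpp
#include <cassert>

// __restrict is a compiler extension standing in for C's restrict.
// It promises that a and b do not point to the same object.
static void swap_ints(int* __restrict a, int* __restrict b) {
    const int tmp = *a;
    *a = *b;
    *b = tmp;
}

// Two swaps in a row cancel out, so for distinct objects this function
// has no net effect on memory.
void double_swap(int* a, int* b) {
    swap_ints(a, b);
    swap_ints(a, b);
}
```

Since the restrict promise rules out overlap, an optimizer is free to prove that every store writes back the value that was already there, which is what the -march=skylake output reflects.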
[Bug c++/110912] New: False assumption that constructors cannot alias any of their parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110912 Bug ID: 110912 Summary: False assumption that constructors cannot alias any of their parameters Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: janschultke at googlemail dot com Target Milestone: --- It looks like GCC thinks that no aliasing can take place between `this` and any constructor parameters. You get the same output if you use `__restrict` for any pointers passed into constructors, or if you add `noalias` to LLVM IR. There is no wording in the standard that disallows aliasing in constructors completely, and this may cause more serious issues in real code. See https://github.com/cplusplus/CWG/issues/206. This bug stems from an over-interpretation of [class.cdtor] p2 (https://eel.is/c++draft/class.cdtor#2), which states that an unspecified value is obtained when not accessing subobjects through the `this` pointer. However, it DOES NOT say that side effects will result in unspecified values, and it DOES NOT say that any undefined behavior is involved, which is the prerequisite to treat this as `noalias`. It seems to have been introduced here, and there is a related LLVM issue: - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 - https://bugs.llvm.org/show_bug.cgi?id=37329 For reference, [class.cdtor] p2: > During the construction of an object, > if the value of the object or any of its subobjects is accessed > through a glvalue that is not obtained, > directly or indirectly, from the constructor's this pointer, > the value of the object or subobject thus obtained is unspecified. THIS WORDING IS NOT STRONG ENOUGH FOR `noalias`! This is an old paragraph which originally applied to const objects only, and has some problems with breaking legitimate code for polymorphic classes. 
I am not entirely sure about the intention behind it myself, but it would have been very easy for the committee to make access through such a pointer (not obtained from `this`) undefined behavior. That would actually allow `noalias`, NOT the current wording. I have talked to other knowledgeable members of the community, and so far no one believes that the wording is strong enough, or intended to disallow aliasing entirely. Related discussions: - https://lists.isocpp.org/std-discussion/2023/08/2324.php - https://lists.isocpp.org/std-discussion/2022/12/1952.php ## Code to Reproduce (https://godbolt.org/z/K4nTPYK3r) void foo(); struct A { int i = 0; [[gnu::used]] A(int &x) { x = 5; if (i == 0) { foo(); } } }; ## Actual Output (gcc trunk -O3) A::A(int&) [base object constructor]: mov DWORD PTR [rdi], 0 mov DWORD PTR [rsi], 5 jmp foo() ## Expected Output (clang trunk -O3) A::A(int&) [base object constructor]: mov dword ptr [rdi], 0 mov dword ptr [rsi], 5 cmp dword ptr [rdi], 0 je foo()@PLT ret ## Suggested Resolution Do not assume `noalias` for all parameters in a constructor.