[Bug tree-optimization/105216] [12/13 regression] 8% regression for m-queens compared to gcc11 O2 on CLX. since r12-3876-g4a960d548b7d7d94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105216 --- Comment #15 from Andrew Pinski --- Might be interesting to test it again to see if it has been fixed on the trunk.
[Bug target/105010] [12/13 regression] GCC 12 after 20220227 fails to build on powerpc64-freebsd with Error: invalid mfcr mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105010 --- Comment #17 from Piotr Kubaj --- This also affects GCC 10.4 and the snapshots of GCC 11.
[Bug target/108255] Repeated address-of (lea) not optimized for size.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108255 --- Comment #1 from Andrew Pinski --- I suspect r0-127773-g3e7291458b96 changed the behavior for GCC 4.9+. I have not yet figured out what changed the behavior for GCC 4.8, though; I suspect the GCC 4.8 cost model was simply incorrect. LLVM might not be tuning correctly anyway. Also note that ICC (not ICX) does the same as GCC, so I think this is an LLVM issue rather than a GCC issue. Someone who knows more about x86 processor behavior can explain more.
[Bug c/108255] New: Repeated address-of (lea) not optimized for size.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108255

            Bug ID: 108255
           Summary: Repeated address-of (lea) not optimized for size.
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: witold.baryluk+gcc at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/q5sx9e49j

void f(int *);

int g(int of) {
    int x = 13;
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    return 0;
}

Got:

g(int):
        sub     rsp, 24
        lea     rdi, [rsp+12]
        mov     DWORD PTR [rsp+12], 13
        call    f(int*)
        lea     rdi, [rsp+12]   # compute, 5 bytes
        call    f(int*)
        lea     rdi, [rsp+12]   # recompute, 5 bytes
        call    f(int*)
        lea     rdi, [rsp+12]   # recompute, 5 bytes
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        xor     eax, eax
        add     rsp, 24
        ret

But note that lea is 5 bytes.

Expected (generated by clang 3.0 - 15.0):

g(int):                                 # @g(int)
        push    rbx                     # extra, but just 1 byte
        sub     rsp, 16
        mov     dword ptr [rsp + 12], 13
        # CSE temp
        lea     rbx, [rsp + 12]
        mov     rdi, rbx                # use
        call    f(int*)@PLT
        mov     rdi, rbx                # reuse, 3 bytes
        call    f(int*)@PLT
        mov     rdi, rbx                # reuse, 3 bytes
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        xor     eax, eax
        add     rsp, 16
        pop     rbx                     # extra, but just 1 byte
        ret

Technically this is more instructions. But mov rdi, rbx is 3 bytes, which is shorter than the 5 bytes of lea. This comes at the minor expense of needing to save and restore rbx.

PS. The same happens when using a temporary `int *const y = &x;`. It also happens when optimizing for size (`-Os`).

It looks like gcc 4.8.5 produced the expected code, but gcc 4.9.0 does not.

It is possible that the code produced by gcc 4.9.0 is faster, but it is also likely that it contributes quite a bit to binary size.

clang uses CSE even if there are just two uses of `&x` in the above example. A slightly higher threshold (3 or 4) is likely optimal (it can be calculated from the encoding sizes).

Weirdly, though, gcc -m32 does this:

g():
        push    ebp
        mov     ebp, esp
        push    ebx
        lea     ebx, [ebp-12]
        sub     esp, 32
        mov     DWORD PTR [ebp-12], 13
        push    ebx
        call    f(int*)
        mov     DWORD PTR [esp], ebx
        call    f(int*)
        mov     DWORD PTR [esp], ebx
        call    f(int*)
        mov     ebx, DWORD PTR [ebp-4]
        xor     eax, eax
        leave
        ret

Here it does compute the address once and keeps it in a temporary, but it does so on the stack instead of in a register (my guess is that there is no free register to hold it, so it is spilled). In fact, lea would likely be faster here: mov DWORD PTR [esp], ebx requires a memory/cache access, whereas lea is 5 bytes but does not require a memory access.
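The threshold mentioned in the report can be worked out from the byte counts quoted in its assembly comments. The following is a minimal sketch, assuming only those sizes (5-byte lea, 3-byte mov, 1-byte push and pop); the helper names are hypothetical, and the model ignores everything else (stack adjustment, alignment padding, register pressure).

```cpp
#include <cstddef>

// Encoding sizes in bytes, taken from the annotations in the report above.
constexpr std::size_t kLeaBytes = 5;      // lea reg, [rsp+12]
constexpr std::size_t kMovBytes = 3;      // mov rdi, rbx
constexpr std::size_t kPushPopBytes = 2;  // push rbx + pop rbx, 1 byte each

// Hypothetical helper: bytes spent when the address is recomputed with lea
// before every call.
constexpr std::size_t recompute_bytes(std::size_t uses) {
    return kLeaBytes * uses;
}

// Hypothetical helper: bytes spent when the address is computed once into a
// callee-saved register and copied with mov before every call.
constexpr std::size_t cse_bytes(std::size_t uses) {
    return kPushPopBytes + kLeaBytes + kMovBytes * uses;
}

// Smallest number of uses at which the CSE form is strictly smaller.
constexpr std::size_t break_even_uses() {
    std::size_t uses = 1;
    while (cse_bytes(uses) >= recompute_bytes(uses)) {
        ++uses;
    }
    return uses;
}
```

Under this simplified model the CSE form only becomes smaller from the fourth use onward, which is in line with the "3 or 4" estimate in the report; for the eight calls in the example it is 31 bytes against 40.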
[Bug c++/104944] [9/10 Regression] incorrect alignas(void) accepted (with warning if templated)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104944

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|12.0                        |11.2.0
   Target Milestone|9.5                         |11.3
      Known to work|                            |11.3.0, 12.1.0
[Bug middle-end/35560] Missing CSE/PRE for memory operations involved in virtual call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35560

Witold Baryluk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |witold.baryluk+gcc at gmail dot com

--- Comment #15 from Witold Baryluk ---
I know this is a pretty old bug, but I was exploring some gcc and clang assembly on godbolt and stumbled into the same issue.

https://godbolt.org/z/qPzMhWse1

class A {
 public:
  virtual int f7(int x) const;
};

int g(const A * const a, int x) {
  int r = 0;
  for (int i = 0; i < 1; i++)
    r += a->f7(x);
  return r;
}

(The same happens without the loop, when just calling a->f7 multiple times.)

g(A const*, int):
        push    r13
        mov     r13d, esi
        push    r12
        xor     r12d, r12d
        push    rbp
        mov     rbp, rdi
        push    rbx
        mov     ebx, 1
        sub     rsp, 8
.L2:
        mov     rax, QWORD PTR [rbp+0]  # a vtable deref
        mov     esi, r13d
        mov     rdi, rbp
        call    [QWORD PTR [rax]]       # f7 indirect call
        add     r12d, eax
        dec     ebx
        jne     .L2
        add     rsp, 8
        pop     rbx
        pop     rbp
        mov     eax, r12d
        pop     r12
        pop     r13
        ret

I was expecting mov rax, QWORD PTR [rbp+0] and call [QWORD PTR [rax]] to be hoisted out of the loop (call converted to lea, and call register).

A bit sad.

Is there some recent work done on this optimization?

Are there at least some cases where it is valid to do CSE, or to change the code so it is moved out of the loop?
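To make the requested transformation concrete, here is a minimal sketch that models the virtual call with an explicit, hand-written function-pointer table. The types `Obj` and `VTable` and both loop functions are hypothetical stand-ins, not what GCC generates for class A. The hoisted form is only equivalent under the assumption that the table pointer stored in the object does not change while the loop runs, which is the property the compiler would have to prove.

```cpp
// Hand-written stand-in for a class with a single virtual function.
struct Obj;

struct VTable {
    int (*f7)(const Obj*, int);
};

struct Obj {
    const VTable* vptr;  // plays the role of the hidden vtable pointer
    int bias;
};

// Hypothetical implementation of the "virtual" function.
static int add_bias(const Obj* self, int x) { return x + self->bias; }

static const VTable kObjVTable = {add_bias};

// Reloads the table pointer and the function pointer on every iteration,
// mirroring the two loads inside .L2 in the assembly above.
int sum_reload(const Obj* a, int x, int n) {
    int r = 0;
    for (int i = 0; i < n; i++)
        r += a->vptr->f7(a, x);
    return r;
}

// Loads the function pointer once before the loop. This matches sum_reload
// only if a->vptr is not modified during the calls (assumption).
int sum_hoisted(const Obj* a, int x, int n) {
    int (*const fn)(const Obj*, int) = a->vptr->f7;
    int r = 0;
    for (int i = 0; i < n; i++)
        r += fn(a, x);
    return r;
}
```

With an object built as `Obj o{&kObjVTable, 2}`, both functions return the same sum for the same arguments; the difference is only in how often the two pointer loads happen.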
[Bug c++/105200] user-defined operator <=> for enumerated types is ignored
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105200

Barry Revzin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |barry.revzin at gmail dot com

--- Comment #6 from Barry Revzin ---
This strikes me as a definite wording issue rather than actual design intent.

Patrick is correct as to what the wording says - it says non-member candidate, so the rewritten candidates don't count. But I think really we should also consider rewritten candidates. Clang and MSVC both do - which seems much more in line with expectation and the original design. For class types, you can just provide <=>, but for enums, you have to provide <, >, <=, >=, and <=>??

I'm opening a Core issue for this: https://github.com/cplusplus/CWG/issues/205
[Bug c++/108254] New: Usage of requires expression with an immedietely invoked lambda expression results in compile error instead of evaluating to false
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108254

            Bug ID: 108254
           Summary: Usage of requires expression with an immedietely invoked lambda expression results in compile error instead of evaluating to false
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: avr5309 at gmail dot com
  Target Milestone: ---

Substitution failure of an immediately invoked lambda expression within a requires expression results in a compile error instead of the requires expression evaluating to 'false' (by my understanding of http://eel.is/c++draft/expr.prim.req#general-5 . Clang 15 conforms to my expectation).

### SOURCE:
The single source file (bug.cpp):-

template <typename T>
concept Container = requires(T t)
{
    { [](T const& t){
        for(auto&& v : t)
            ;
    }(t) };
};

int main()
{
    static_assert(!Container);
}

### COMPILER INVOCATION:
g++ -fsyntax-only -std=c++20 -Wall -Wextra -pedantic-errors -xc++ -

### ACTUAL OUTPUT:
The following error message:-

bug.cpp: In lambda function:
bug.cpp:5:13: error: 'begin' was not declared in this scope
    5 |         for(auto&& v : t)
      |             ^~~
bug.cpp:5:13: error: 'end' was not declared in this scope
bug.cpp: In function 'int main()':
bug.cpp:12:19: error: static assertion failed
   12 |     static_assert(!Container);
      |                   ^~~

### EXPECTED OUTPUT:
(clean compile)

### COMPILER VERSION INFO (g++ -v):
Reading specs from /usr/lib64/gcc/x86_64-unknown-linux-gnu/12.2.0/specs
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-unknown-linux-gnu/12.2.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /builddir/gcc-12.2.0/configure --build=x86_64-unknown-linux-gnu --enable-gnu-unique-object --enable-vtable-verify --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --libexecdir=/usr/lib64 --libdir=/usr/lib64 --enable-threads=posix --enable-__cxa_atexit --disable-multilib --with-system-zlib --enable-shared --enable-lto --enable-plugins --enable-linker-build-id --disable-werror --disable-nls --enable-default-pie --enable-default-ssp --enable-checking=release --disable-libstdcxx-pch --with-isl --with-linker-hash-style=gnu --disable-sjlj-exceptions --disable-target-libiberty --enable-languages=c,c++,objc,obj-c++,fortran,lto,go,ada
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (GCC)
[Bug preprocessor/108244] [13 Regression] `pragma GCC diagnostic` and -E -fdirectives-only causes the preprocessor to become confused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108244 --- Comment #8 from Lewis Hyatt --- Here is the patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609275.html
[Bug libstdc++/108235] FAIL: g++.dg/compat/abi/bitfield1 cp_compat_x_tst.o-cp_compat_y_tst.o link
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108235

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-12-30
             Status|UNCONFIRMED                 |ASSIGNED
[Bug libstdc++/108235] FAIL: g++.dg/compat/abi/bitfield1 cp_compat_x_tst.o-cp_compat_y_tst.o link
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108235

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0
           Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
[Bug pch/105858] MinGW-w64 64-bit build with --libstdcxx-pch: fatal error: cannot write PCH file: required memory segment unavailable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105858

--- Comment #8 from Brecht Sanders ---
I seem to be having some success after applying patches based on:
https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-gcc/0010-Fix-using-large-PCH.patch
https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-gcc/0021-PR14940-Allow-a-PCH-to-be-mapped-to-a-different-addr.patch

My patch for GCC 12.2.0 looks like this:

patch -ulbf gcc/config/i386/host-mingw32.cc << EOF
@@ -46,5 +46,2 @@
-/* FIXME: Is this big enough? */
-static const size_t pch_VA_max_size = 128 * 1024 * 1024;
-
 /* Granularity for reserving address space. */
@@ -90,5 +87,2 @@
 void* res;
- size = (size + va_granularity - 1) & ~(va_granularity - 1);
- if (size > pch_VA_max_size)
-return NULL;
@@ -102,3 +96,3 @@
- res = VirtualAlloc (NULL, pch_VA_max_size,
+ res = VirtualAlloc (NULL, size,
 MEM_RESERVE | MEM_TOP_DOWN,
@@ -143,3 +137,2 @@
 OSVERSIONINFO version_info;
- int r;
@@ -152,3 +145,3 @@
 this to work. We can't change the offset. */
- if ((offset & (va_granularity - 1)) != 0 || size > pch_VA_max_size)
+ if ((offset & (va_granularity - 1)) != 0)
 return -1;
@@ -177,21 +170,20 @@
- /* Retry five times, as here might occure a race with multiple gcc's
- instances at same time. */
- for (r = 0; r < 5; r++)
- {
- mmap_addr = MapViewOfFileEx (mmap_handle, FILE_MAP_COPY, 0, offset,
- size, addr);
- if (mmap_addr == addr)
- break;
- if (r != 4)
-Sleep (500);
- }
-
- if (mmap_addr != addr)
+ /* Try mapping the file at \`addr\`. */
+ mmap_addr = MapViewOfFileEx (mmap_handle, FILE_MAP_COPY, 0, offset,
+ size, addr);
+ if (mmap_addr == NULL)
 {
- w32_error (__FUNCTION__, __FILE__, __LINE__, "MapViewOfFileEx");
- CloseHandle(mmap_handle);
- return -1;
+ /* We could not map the file at its original address, so let the
+system choose a different one. The PCH can be relocated later. */
+ mmap_addr = MapViewOfFileEx (mmap_handle, FILE_MAP_COPY, 0, offset,
+ size, NULL);
+ if (mmap_addr == NULL)
+ {
+ w32_error (__FUNCTION__, __FILE__, __LINE__, "MapViewOfFileEx");
+ CloseHandle(mmap_handle);
+ return -1;
+ }
 }
+ addr = mmap_addr;
 return 1;
EOF
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

--- Comment #6 from CVS Commits ---
The master branch has been updated by Stam Markianos-Wright:

https://gcc.gnu.org/g:4269a6567eb991e6838f40bda5be9e3a7972530c

commit r13-4935-g4269a6567eb991e6838f40bda5be9e3a7972530c
Author: Stam Markianos-Wright
Date:   Fri Dec 30 11:25:22 2022 +

    Fix memory constraint on MVE v[ld/st][2/4] instructions [PR107714]

    In the M-Class Arm-ARM:
    https://developer.arm.com/documentation/ddi0553/bu/?lang=en
    these MVE instructions only have '!' writeback variant and at:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714
    we found that the Um constraint would also allow through a
    register offset writeback, resulting in an assembler error.

    Here I have added a new constraint and predicate for these
    instructions, which (uniquely, AFAICT), only support a `!` writeback
    increment by the data size (inside the compiler this is a POST_INC).

    No regressions in arm-none-eabi with MVE and MVE.FP.

    gcc/ChangeLog:

            PR target/107714
            * config/arm/arm-protos.h (mve_struct_mem_operand): New prototype.
            * config/arm/arm.cc (mve_struct_mem_operand): New function.
            * config/arm/constraints.md (Ug): New constraint.
            * config/arm/mve.md (mve_vst4q): Change constraint.
            (mve_vst2q): Likewise.
            (mve_vld4q): Likewise.
            (mve_vld2q): Likewise.
            * config/arm/predicates.md (mve_struct_operand): New predicate.

    gcc/testsuite/ChangeLog:

            PR target/107714
            * gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c: New test.
[Bug target/108232] Bus & Segmentation error on lz4_decompress.c while make linux-raspi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108232 --- Comment #10 from Filip Roland --- Hi! GCC 13 has not been released yet, but I ran make with gcc-12 and it got past lz4_decompress.o. Thanks!