[Bug c/104468] with -O -g, quadratic compile time of function with __attribute__(("00")) that passes large structs by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #6 from Erik Carstensen ---
Thanks! It looks like the second change repairs __attribute__((optimize("O0"))). This leads to a smaller reproducer: the problem still reproduces if I remove that attribute and compile with only "-g -O0 -fvar-tracking". The first commit somehow enables "QI vector mode", whatever that is.

0.8s still seems like quite a lot. What happens in a recent gcc if you change the function to, say,

    void f(void) {
        // more than 4x slower if you add R2()
        R8(R256(fun(S,S,S,S);))
    }

and compile with -g -O0 -fvar-tracking? If it takes ~6s, then I suppose we are down to slow linear time; if ~50s, then we still have quadratic behaviour.
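The ~6s and ~50s thresholds above follow from simple scaling. The sketch below assumes that the 0.8s figure was measured with R256(...) alone, so that wrapping it in R8(...) multiplies the number of calls by 8; that baseline is an inference from the numbers, not something stated in the thread. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Predict compile time after scaling the input by `factor`,
 * assuming time grows as (input size)^degree.
 * degree = 1 models linear growth, degree = 2 quadratic growth. */
double predicted_seconds(double base_seconds, double factor, int degree)
{
    assert(degree >= 0);
    double t = base_seconds;
    for (int i = 0; i < degree; i++)
        t *= factor;
    return t;
}

/* Print both predictions for an assumed 0.8s baseline and 8x more calls. */
void print_predictions(void)
{
    printf("linear:    %.1fs\n", predicted_seconds(0.8, 8.0, 1));
    printf("quadratic: %.1fs\n", predicted_seconds(0.8, 8.0, 2));
}
```

Under that assumption the linear estimate is 6.4s and the quadratic estimate is 51.2s, which matches the rough ~6s and ~50s cut-offs in the comment.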
[Bug c/104468] with -O -g, quadratic compile time of function with __attribute__(("00")) that passes large structs by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #3 from Erik Carstensen ---
Do we know that some suspected underlying issue is fixed, or could it be that the window of slowness (struct size ∈ [17,80]) has just moved?
[Bug c/104468] with -O -g, quadratic compile time of function with __attribute__(("00")) that passes large structs by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #2 from Erik Carstensen ---
Perhaps the problem is unrelated to function calls; it seems the time is quadratic in the number of struct literals. If I change the argument types to pointers, then the issue remains if I pass the args as

    ({static s_t x; x=(s_t){{0}};&x;})

but it vanishes if I pass them as

    ({static s_t x=(s_t){{0}};&x;})
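For readers unfamiliar with the two argument forms, here is a minimal sketch of how they might be used in a call. It relies on GNU C extensions (statement expressions, and a compound literal as a static initializer), so it assumes gcc in its default GNU mode. The struct size, the callee `first_byte`, and the wrapper function names are hypothetical and do not come from the attached reproducer.

```c
#include <assert.h>

/* Hypothetical stand-in for the struct in the reproducer;
 * 32 bytes is an assumed size. */
typedef struct { char bytes[32]; } s_t;

/* Hypothetical callee that takes its argument by pointer. */
int first_byte(const s_t *p)
{
    assert(p != 0);
    return p->bytes[0];
}

/* Form reported as still slow to compile: the static object is
 * declared first and then assigned from a compound literal,
 * so a struct copy happens on every call. */
int call_with_assignment(void)
{
    return first_byte(({ static s_t x; x = (s_t){{0}}; &x; }));
}

/* Form reported as fast to compile: the static object is
 * initialized at its definition, so no copy is emitted
 * inside the function body. */
int call_with_initializer(void)
{
    return first_byte(({ static s_t x = (s_t){{0}}; &x; }));
}
```

Both wrappers behave the same at run time; the difference reported in the comment is only in compile time when many such arguments appear in one function.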
[Bug c/104468] New: with -O -g, quadratic compile time of function with __attribute__(("00")) that passes large structs by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

            Bug ID: 104468
           Summary: with -O -g, quadratic compile time of function with __attribute__(("00")) that passes large structs by value
           Product: gcc
           Version: 11.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: erik.carstensen at intel dot com
  Target Milestone: ---

Created attachment 52392
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52392&action=edit
Reproducer

If a function passes many large structs by value to another function, then you get quadratic compile performance (O(n^2)) if the file is compiled with -O -g, but the function is annotated with __attribute__((optimize("O0"))).

Compile time seems (approximately) quadratic independently in the number of calls, in the number of struct function arguments, and in the size of the struct. In other words, quadratic in the total size of passed values.

It compiles instantaneously (30s -> 0.1s) if I remove the __attribute__, or -g, or -O, or if the struct size is changed to <=16 bytes or >=81 bytes.

It's still slow if I pass

    -O -fno-auto-inc-dec -fno-branch-count-reg -fno-combine-stack-adjustments
    -fno-compare-elim -fno-cprop-registers -fno-dce -fno-defer-pop -fno-dse
    -fno-forward-propagate -fno-guess-branch-probability -fno-if-conversion
    -fno-if-conversion2 -fno-inline-functions-called-once -fno-ipa-modref
    -fno-ipa-profile -fno-ipa-pure-const -fno-ipa-reference
    -fno-ipa-reference-addressable -fno-merge-constants
    -fno-move-loop-invariants -fno-omit-frame-pointer -fno-reorder-blocks
    -fno-shrink-wrap -fno-shrink-wrap-separate -fno-split-wide-types
    -fno-ssa-backprop -fno-ssa-phiopt -fno-tree-bit-ccp -fno-tree-ccp
    -fno-tree-ch -fno-tree-coalesce-vars -fno-tree-copy-prop -fno-tree-dce
    -fno-tree-dominator-opts -fno-tree-dse -fno-tree-forwprop -fno-tree-fre
    -fno-tree-phiprop -fno-tree-pta -fno-tree-scev-cprop -fno-tree-sink
    -fno-tree-slsr -fno-tree-sra -fno-tree-ter -fno-unit-at-a-time ...

which is documented to be the same as -O0.

This happens with native gcc from Fedora 34:

$ gcc --version
gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
$ uname -a
Linux ecarsten-mobl1.ger.corp.intel.com 5.15.12-100.fc34.x86_64 #1 SMP Wed Dec 29 15:21:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Also reproduced with gcc 6.4.

Command line:

$ gcc -g -O1 -c foo.c

or alternatively (to bypass ccache on my system):

$ /usr/libexec/gcc/x86_64-redhat-linux/11/cc1 -quiet foo.c -quiet -dumpbase foo.c -dumpbase-ext .c -mtune=generic -march=x86-64 -g -O0 -o /tmp/ccFglVbD.s

This causes performance issues in C code generated by the DML compiler (https://github.com/intel/device-modeling-language).
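The attached reproducer is not included in this message. A minimal sketch of the shape the report describes might look like the following; it is not the attachment itself. The struct name and its 32-byte size (chosen to fall inside the reported 17..80 byte window), the repetition macros, the counter, and the deliberately small repetition count are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical struct: 32 bytes, inside the reported slow window. */
typedef struct { char bytes[32]; } s_t;

/* Counts invocations so the sketch has observable behaviour. */
static int calls;

/* Callee that receives four structs by value. */
void fun(s_t a, s_t b, s_t c, s_t d)
{
    (void)a; (void)b; (void)c; (void)d;
    calls++;
}

/* A struct literal, and macros that repeat a statement 2, 4, 16 times. */
#define S ((s_t){{0}})
#define R2(x) x x
#define R4(x) R2(R2(x))
#define R16(x) R4(R4(x))

/* Caller that stays unoptimized even when the file is built with -O. */
__attribute__((optimize("O0")))
void f(void)
{
    R16(fun(S, S, S, S);)
}

/* Accessor for the call counter. */
int call_count(void)
{
    assert(calls >= 0);
    return calls;
}
```

Building a file like this with gcc -g -O1 -c and then increasing the repetition count, the number of struct arguments, or the struct size is one way to look for the non-linear growth described above. With only 16 calls, this sketch is not expected to be slow by itself.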
[Bug ipa/85233] Incorrect -Wmaybe-uninitialized with -fpartial-inlining -finline-small-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85233

--- Comment #2 from Erik Carstensen ---
I know nothing about GCC internals, but I did make some observations on the warning's behaviour while minimizing the test case. An unqualified guess based on this is that intraprocedural analysis is not done unless the function is inlined; if GCC decides to not inline a function, then the warning is effectively suppressed in order to avoid having to do intraprocedural analysis.

A possible bug might then be that f "counts" as inlined (in the sense that the warning is not suppressed) even if it is only partially inlined.
[Bug c/85233] New: Incorrect -Wmaybe-uninitialized with -fpartial-inlining -finline-small-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85233

            Bug ID: 85233
           Summary: Incorrect -Wmaybe-uninitialized with -fpartial-inlining -finline-small-functions
           Product: gcc
           Version: 7.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: erik.carstensen at intel dot com
  Target Milestone: ---

Created attachment 43858
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43858&action=edit
Reproducer

When compiling the attachment with

    gcc -Wall -c foo.c -o foo.o -O1 -fpartial-inlining -finline-small-functions

I get the warning:

foo.c:26:11: warning: ‘x’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         c = x;

though it can be easily proven that x is not used uninitialized.

I tried this with 7.3.1 from Fedora 27:

$ gcc -v
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-libmpx --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC)

I also tried some other gcc versions I had lying around. I can reproduce it in all versions I have that support -fpartial-inlining (down to gcc 5.2.0).

While minimizing the test case I could see that the warning only seems to happen when f is partially inlined, so a workaround is to pass -fno-partial-inlining for affected files.
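To illustrate what "x is not used uninitialized" can mean in practice, here is a hypothetical sketch of a callee with an early exit that writes its output only on success, and a caller that reads the output only after checking for success. This is not the attached reproducer, and it has not been checked to trigger the warning; the function `g`, the `ok` parameter, and the value 42 are invented for illustration.

```c
#include <assert.h>

/* Hypothetical callee: writes *out only on the success path and
 * reports through its return value whether it did so. */
int f(int *out, int ok)
{
    assert(out != 0);
    if (!ok)
        return -1;      /* early exit: *out is left untouched */
    *out = 42;
    return 0;
}

/* Hypothetical caller: x is read only when f() reported success,
 * so x is never read while uninitialized. */
int g(int ok)
{
    int x;
    int c = 0;
    if (f(&x, ok) == 0)
        c = x;
    return c;
}
```

In code of this shape, the read of x is guarded by the same condition that guarantees the write, which is the kind of argument the report refers to as easily proven.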