[Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value

2022-02-24 Thread erik.carstensen at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #6 from Erik Carstensen  ---
Thanks! It looks like the second change repairs __attribute__((optimize("O0"))).
This leads to a smaller reproducer: the problem is reproduced if I remove that
attribute and compile with "-g -O0 -fvar-tracking" only.

The first commit somehow enables "QI vector mode", whatever that is. 0.8s still
seems like quite a lot; what happens in a recent gcc if you change the function
to, say, 

void f(void)
{
// more than 4x slower if you add R2()
R8(R256(fun(S,S,S,S);))
}

and compile with -g -O0 -fvar-tracking? If it takes ~6s, then I suppose we are
down to slow linear time; if ~50s, then we still have quadratic behaviour.
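
The ~6s and ~50s thresholds follow from simple extrapolation. A minimal sketch,
assuming that wrapping the body in R8() makes the function 8 times larger than
the version that took 0.8s (the factor 8 is inferred from the macro name, and
the helper names are hypothetical):

```c
#include <assert.h>

/* Hypothetical helpers: extrapolate a measured compile time when the
 * amount of repeated code grows by `factor`. */
double predict_linear(double base_seconds, double factor)
{
    assert(factor > 0.0);
    return base_seconds * factor;            /* time grows like n */
}

double predict_quadratic(double base_seconds, double factor)
{
    assert(factor > 0.0);
    return base_seconds * factor * factor;   /* time grows like n^2 */
}
```

With a 0.8s baseline and a factor of 8, the linear estimate is 6.4s and the
quadratic estimate is 51.2s, which is where the two figures come from.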

[Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value

2022-02-09 Thread erik.carstensen at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #3 from Erik Carstensen  ---
Do we know that some suspected underlying issue is fixed, or could it be that
the window of slowness (struct size ∈ [17,80]) just has moved?

[Bug c/104468] with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value

2022-02-09 Thread erik.carstensen at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

--- Comment #2 from Erik Carstensen  ---
Perhaps the problem is unrelated to function calls; it seems the time is
quadratic in the number of struct literals. If I change the argument types to
pointers, then the issue remains if I pass the args as ({static s_t x;
x=(s_t){{0}};&x;}), but it vanishes if I pass them as ({static s_t
x=(s_t){{0}};&x;}).
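
The two argument forms compared above, written out as a sketch. The 32-byte
size of s_t, the body of fun, and the names SLOW_ARG, FAST_ARG, call_slow,
call_fast and call_count are assumptions for illustration only. Statement
expressions and static initialization from a compound literal are GNU C
extensions.

```c
#include <assert.h>

/* Assumed layout: 32 bytes, inside the reported slow window. */
typedef struct { char bytes[32]; } s_t;

static int calls;  /* counts invocations so the sketch can be checked */

static void fun(const s_t *a, const s_t *b, const s_t *c, const s_t *d)
{
    assert(a && b && c && d);
    calls++;
}

/* Reported as still slow: the static object is assigned at run time
 * from a compound literal. */
#define SLOW_ARG ({ static s_t x; x = (s_t){{0}}; &x; })

/* Reported as fast: the static object has a static initializer, so no
 * run-time struct assignment is emitted. */
#define FAST_ARG ({ static s_t x = (s_t){{0}}; &x; })

void call_slow(void) { fun(SLOW_ARG, SLOW_ARG, SLOW_ARG, SLOW_ARG); }
void call_fast(void) { fun(FAST_ARG, FAST_ARG, FAST_ARG, FAST_ARG); }

int call_count(void) { return calls; }
```

Both variants pass the same four pointers to zeroed structs at run time; they
differ only in whether a struct assignment appears in the function body.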

[Bug c/104468] New: with -O -g, quadratic compile time of function with __attribute__((optimize("O0"))) that passes large structs by value

2022-02-09 Thread erik.carstensen at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104468

Bug ID: 104468
   Summary: with -O -g, quadratic compile time of function with
__attribute__((optimize("O0"))) that passes large structs by
value
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erik.carstensen at intel dot com
  Target Milestone: ---

Created attachment 52392
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52392&action=edit
Reproducer

If a function passes many large structs by value to another function, then
compile time becomes quadratic (O(n^2)) if the file is compiled with -O -g
but the function is annotated with __attribute__((optimize("O0"))).

Compile time seems (approximately) quadratic independently in the number of
calls, in the number of struct function arguments, and in the size of the
struct. In other words, quadratic in the total size of passed values.

It compiles instantaneously (30s -> 0.1s) if I remove the __attribute__, or -g,
or -O, or if the struct size is changed to <=16 bytes or >=81 bytes.
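
The attached reproducer is not quoted in this message; the following is only a
sketch of the shape described above. The 32-byte struct size, the body of fun,
the macro definitions and the helper run_f are assumptions, and the repetition
count is kept small so that the sketch itself compiles quickly.

```c
#include <assert.h>

/* Assumed size: 32 bytes, inside the slow window of 17..80 bytes. */
typedef struct { char bytes[32]; } s_t;

static unsigned long calls;  /* counts invocations so the sketch is checkable */

static void fun(s_t a, s_t b, s_t c, s_t d)
{
    (void)a; (void)b; (void)c; (void)d;
    calls++;
}

/* A struct literal passed by value, and macros that repeat a statement. */
#define S ((s_t){{0}})
#define R2(x) x x
#define R4(x) R2(R2(x))
#define R8(x) R2(R4(x))
#define R16(x) R2(R8(x))

/* The attribute that triggers the slowdown when the file is built with
 * -O -g. */
__attribute__((optimize("O0")))
void f(void)
{
    R16(fun(S, S, S, S);)
}

/* Hypothetical helper: run f once and report how many calls it made. */
unsigned long run_f(void)
{
    assert(sizeof(s_t) > 16 && sizeof(s_t) < 81);
    calls = 0;
    f();
    return calls;
}
```

Increasing the repetition (for example by adding R32, R64 and so on in the same
pattern) is what should expose the growth in compile time; the run-time
behaviour of the program is beside the point.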

It's still slow if I pass
-O -fno-auto-inc-dec -fno-branch-count-reg -fno-combine-stack-adjustments
-fno-compare-elim -fno-cprop-registers -fno-dce -fno-defer-pop  -fno-dse
-fno-forward-propagate -fno-guess-branch-probability -fno-if-conversion
-fno-if-conversion2 -fno-inline-functions-called-once -fno-ipa-modref
-fno-ipa-profile -fno-ipa-pure-const -fno-ipa-reference
-fno-ipa-reference-addressable -fno-merge-constants -fno-move-loop-invariants
-fno-omit-frame-pointer -fno-reorder-blocks -fno-shrink-wrap
-fno-shrink-wrap-separate -fno-split-wide-types -fno-ssa-backprop
-fno-ssa-phiopt -fno-tree-bit-ccp -fno-tree-ccp -fno-tree-ch
-fno-tree-coalesce-vars -fno-tree-copy-prop -fno-tree-dce
-fno-tree-dominator-opts -fno-tree-dse -fno-tree-forwprop -fno-tree-fre
-fno-tree-phiprop -fno-tree-pta -fno-tree-scev-cprop -fno-tree-sink
-fno-tree-slsr -fno-tree-sra -fno-tree-ter -fno-unit-at-a-time
... which is documented to be the same as -O0.

This happens with native gcc from Fedora 34:
$ gcc --version
gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
$ uname -a
Linux ecarsten-mobl1.ger.corp.intel.com 5.15.12-100.fc34.x86_64 #1 SMP Wed Dec
29 15:21:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Also reproduced with gcc 6.4.

Command line:
$ gcc -g -O1 -c foo.c
or alternatively (to bypass ccache on my system):
$ /usr/libexec/gcc/x86_64-redhat-linux/11/cc1 -quiet foo.c -quiet -dumpbase
foo.c -dumpbase-ext .c -mtune=generic -march=x86-64 -g -O0 -o /tmp/ccFglVbD.s

This causes performance issues in C code generated by the DML compiler
(https://github.com/intel/device-modeling-language)

[Bug ipa/85233] Incorrect -Wmaybe-uninitialized with -fpartial-inlining -finline-small-functions

2018-04-06 Thread erik.carstensen at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85233

--- Comment #2 from Erik Carstensen  ---
I know nothing about GCC internals, but I did make some observations on the
warning's behaviour while minimizing the test case. An unqualified guess based
on this is that intraprocedural analysis is not done unless the function is
inlined; if GCC decides to not inline a function, then the warning is
effectively suppressed in order to avoid having to do intraprocedural analysis.
A possible bug might then be that f "counts" as inlined (in the sense that the
warning is not suppressed) even if it is only partially inlined.

[Bug c/85233] New: Incorrect -Wmaybe-uninitialized with -fpartial-inlining -finline-small-functions

2018-04-05 Thread erik.carstensen at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85233

Bug ID: 85233
   Summary: Incorrect -Wmaybe-uninitialized with
-fpartial-inlining -finline-small-functions
   Product: gcc
   Version: 7.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erik.carstensen at intel dot com
  Target Milestone: ---

Created attachment 43858
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43858&action=edit
Reproducer

When compiling the attachment with
gcc -Wall -c foo.c -o foo.o -O1 -fpartial-inlining -finline-small-functions
I get the warning:
foo.c:26:11: warning: ‘x’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
 c = x;
though it can be easily proven that x is not used uninitialized.

I tried this with 7.3.1 from Fedora 27:
$ gcc -v
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin
--enable-initfini-array --with-isl --enable-libmpx
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC) 

I also tried some other gcc versions I had lying around. I can reproduce it in
all versions I have that support -fpartial-inlining (down to gcc 5.2.0).

While minimizing the test case I could see that the warning only seems to
happen when f is partially inlined, so a workaround is to pass
-fno-partial-inlining for affected files.