[Bug c/100150] ice in bp_unpack_string

2021-08-10 Thread krzysztof.a.nowicki+gcc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100150

Krzysztof Nowicki changed:

   What|Removed |Added
   CC  |        |krzysztof.a.nowicki+gcc@gmail.com

--- Comment #14 from Krzysztof Nowicki  ---
I'm seeing this very often now on my Gentoo system after upgrading from GCC
11.1 to 11.2. The cause of this ICE is well understood - the LTO bitstream is
incompatible. I however see this as a regression, as in previous GCC upgrades
the LTO linker always explicitly complained that a static library or object has
incompatible LTO data due to being built with an earlier compiler. With the
11.1 to 11.2 transition no such error happens, but the linker ICEs.

I suspect that this has happened due to introduction of an incompatible change
in the LTO bitstream without bumping the bitstream version.

This is very annoying, as previously, with the error message, it was
immediately clear which static library or object was at fault. Now, with the
ICE, I have to guess which of the many libraries on the linker command line is
the problem.

[Bug ipa/96059] New: ICE: in remove_unreachable_nodes, at ipa.c:575 with -fdevirtualize-at-ltrans

2020-07-04 Thread krzysztof.a.nowicki+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96059

            Bug ID: 96059
           Summary: ICE: in remove_unreachable_nodes, at ipa.c:575 with
                    -fdevirtualize-at-ltrans
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krzysztof.a.nowicki+gcc at gmail dot com
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

When building the ktexteditor-5.71 from the KDE Frameworks with LTO enabled I'm
seeing an ICE when linking libKF5TextEditor.so:

during IPA pass: inline
lto1: internal compiler error: in remove_unreachable_nodes, at ipa.c:575
0xa7e802 symbol_table::remove_unreachable_nodes(_IO_FILE*)
        /var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa.c:575
0x19cb14f ipa_inline
        /var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa-inline.c:2696
0x19cb702 execute
        /var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa-inline.c:3091
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.

GDB backtrace:

#0  internal_error (gmsgid=gmsgid@entry=0x233d39a "in %s, at %s:%d") at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/diagnostic.c:1706
#1  0x01b09d9a in fancy_abort (file=file@entry=0x1c07468
"/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa.c",
line=line@entry=575, function=function@entry=0x1c073fc
"remove_unreachable_nodes")
at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/diagnostic.c:1778
#2  0x00a7e803 in symbol_table::remove_unreachable_nodes
(this=0x76e8d100, file=0x0) at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa.c:577
#3  0x019cb150 in ipa_inline () at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa-inline.c:2696
#4  0x019cb703 in (anonymous namespace)::pass_ipa_inline::execute
(this=) at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/ipa-inline.c:3091
#5  0x00c047a3 in execute_one_pass (pass=pass@entry=0x3edc4a0) at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/passes.c:2502
#6  0x00c06052 in execute_ipa_pass_list (pass=0x3edc4a0) at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/passes.c:2929
#7  0x006442ac in do_whole_program_analysis () at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/context.h:48
#8  0x006445f7 in lto_main () at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/lto/lto.c:637
#9  0x00d1844d in compile_file () at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/toplev.c:458
#10 0x00d1b891 in do_compile () at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/toplev.c:2277
#11 0x00d1c1fd in toplev::main (this=this@entry=0x7fffd7f6,
argc=, argc@entry=36, argv=,
argv@entry=0x7fffd8f8) at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/toplev.c:2416
#12 0x01ae9eeb in main (argc=36, argv=0x7fffd8f8) at
/var/tmp/portage/sys-devel/gcc-10.1.0-r1/work/gcc-10.1.0/gcc/main.c:39

CXXFLAGS: -O2 -pipe -march=skylake -flto=3 -fgraphite-identity
-floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta
-fno-semantic-interposition

The ICE goes away after removing the -fdevirtualize-at-ltrans flag.

The backtrace above was generated from a Gentoo-patched version of GCC, but the
ICE is also reproducible with a vanilla version compiled directly from sources.

This is a regression since GCC 10 (reproducible also with latest GCC 11 trunk),
as GCC 9 compiles this package with the same CXXFLAGS without issues.

I've bisected the regression to the following commit:

commit 2bc2379be5c98d34ecbb347b2abf059aa6d94499
Author: Jan Hubicka 
Date:   Mon Nov 4 20:39:52 2019 +0100

ipa-inline-transform.c: Include ipa-utils.h


* ipa-inline-transform.c: Include ipa-utils.h
(inline_call): Set thunk_expansion flag.
* ipa-utils.h (thunk_expansion): Declare.
* ipa-devirt.c (thunk_expansion): New global var.
(devirt_node_removal_hook): Do not invalidate cache while
doing thunk expansion.

From-SVN: r277789

Maybe the "HACK alert" in the commit diff has something to do with it :)

[Bug middle-end/95052] Excess padding of partially initialized strings/char arrays

2020-05-12 Thread krzysztof.a.nowicki+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

--- Comment #7 from Krzysztof Nowicki  ---
Thanks for the very quick response. I've applied the patch on top of GCC 9.1
and it indeed fixes the problem we've seen on MIPS64.

[Bug target/95052] Excess padding of partially initialized strings/char arrays

2020-05-11 Thread krzysztof.a.nowicki+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

--- Comment #5 from Krzysztof Nowicki  ---
(In reply to Martin Sebor from comment #4)
> I don't expect the commit above to have changed anything for the latter
> form, and I would expect each back end to choose the same optimal code to
> emit in both cases.  So I don't think the commit above is a regression; it
> just exposed an inefficiency that was already present.

Yes, from the implementation point of view I agree: the missed optimization
was there all along, and this commit merely exposed it in one more use case
where it wasn't seen before. However, consider the point of view of an
application developer on an embedded system who has developed the application
with some memory constraints in mind. A sudden increase in memory usage just
from upgrading the compiler, possibly causing an OOM because the memory budget
on this (legacy) platform is very tight, is a regression.

[Bug c/95052] Excess padding of partially initialized strings/char arrays

2020-05-11 Thread krzysztof.a.nowicki+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

--- Comment #3 from Krzysztof Nowicki  ---
Note that this missed optimization is actually a regression (at least in our
case on MIPS64). Commit 23aa9f7c4637ad51587e536e245ae6adb5391bbc (released in
GCC 8.x) added the possibility to optimize initialization of a char array into
a string initialization. This means that this code:

  #include <unistd.h>
  int main(int argc, char *argv[])
  {
    char buf[1*1024*1024] = { 0 };
    return read(0, buf, sizeof buf);
  }

generates the following binary sizes for MIPS64:

 * 13632 bytes (GCC-6.4)
 * 1061656 bytes (GCC-9.1)

[Bug c/95052] Excess padding of partially initialized strings/char arrays

2020-05-11 Thread krzysztof.a.nowicki+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

--- Comment #2 from Krzysztof Nowicki  ---
(In reply to Richard Biener from comment #1)
> I'm not sure what you describe as padding is padding.  Instead it's valid to
> access all elements of the array you declare and thus it must be initialized.

Not in this case, as this is a constant initializer, which is anonymous and is
never accessible from the code directly.

> What could be done is elide zero-padding parts to a memset() call.

Exactly, this is what I mean. Currently this actually happens in the generated
code (see attachment), but the part of GCC which allocates the variable in the
.rodata section doesn't know that and allocates memory for the full contents
including the zero padding.

[Bug c/95052] New: Excess padding of partially initialized strings/char arrays

2020-05-11 Thread krzysztof.a.nowicki+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

            Bug ID: 95052
           Summary: Excess padding of partially initialized strings/char
                    arrays
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krzysztof.a.nowicki+gcc at gmail dot com
  Target Milestone: ---

Created attachment 48506
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48506&action=edit
generated assembly (GCC 11.0 trunk, -Os -g0)

When compiling the following code with -Os:

  extern void func(char *buf, unsigned size);
  int main(int argc, char *argv[])
  {
    char str[1*1024*1024] =
"fooiuhluhpiuhliuhliyfyukyfklyugkiuhpoipoipoipoipoipoipoipoipoipoipoipoipoimipoipiuhoulouihnliuhl";
    char arr[1*1024*1024] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7,
      8, 9, 0, 1, 6, 2, 3, 4, 5, 6, 7, 8, 9, 0, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1,
      2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
    func(str, sizeof(str));
    func(arr, sizeof(arr));
  }

GCC generates initializers for both local variables (str and arr) in the
.rodata section and at run-time initializes the explicit part of the variable
with the provided contents, and zero-inits the remainder.

Unfortunately the initializer stored in the .rodata section is padded up to the
target array size:

.LC0:
.string
"fooiuhluhpiuhliuhliyfyukyfklyugkiuhpoipoipoipoipoipoipoipoipoipoipoipoipoimipoipiuhoulouihnliuhl"
.zero   1048479
.LC1:
.string "\001\026\003\004\005?\007\b'"
.string "\001\002\003\004\005>\033\b1"
.string "\001\006\002\003\004\005\006\007\b\t"
.string "\003\001\002\003\004\005\006\007\b\t"
.string "\001\002\003\004\005\006\007\b\t"
.string "\001\002\003\004\005\006\007\b\t"
.string "\002\002\003\004\005>\033\b1"
.string "\001\006\002\003\004\005\006\007\b\t"
.string "\003\001\002\003\004\005\006\007\b\t"
.string "\001\002\003\004\005\006\007\b\t"
.string "\001\002\003\004\005"
.zero   1048466

This causes the resulting binary to become unnecessarily large, even though the
zero padding is completely redundant (the run-time initializer code does not
copy these bytes to the target variable, but zero-initializes them).

I suspect that this is caused by GCC not being able to distinguish between:
 - initialization of a global (or static local) variable,
 - initialization of a local variable

In the former case the contents of the variable live in the read/write data
section and are initialized by the compiler. In such a case padding is
necessary, as any further changes to the variable will be done in-place.

In the latter case the contents of the variable live on the stack and are
initialized from a read-only copy in the read-only data section. In such a case
only the non-zero explicitly initialized part needs to be stored; any padding
can be skipped as it will not be used.
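
The difference between the two cases can be illustrated with a small sketch.
The function names and the array size N are made up for illustration; the
sketch only shows the language-level behaviour (in-place modification of a
static object versus re-initialization of an automatic one), not what any
particular GCC version emits.

```c
#include <assert.h>

enum { N = 64 };  /* small size chosen only for illustration */

/* Static storage: buf is the initialized object itself, so writes are
   done in place and persist across calls. */
char bump_static(void)
{
    static char buf[N] = "abc";
    return ++buf[0];
}

/* Automatic storage: buf is re-created from its initializer on every call,
   so the stored initializer is never modified. */
char bump_local(void)
{
    char buf[N] = "abc";
    return ++buf[0];
}

/* Returns 0 if the two storage classes behave as described. */
int storage_demo(void)
{
    char s1 = bump_static();
    char s2 = bump_static();   /* sees the previous in-place write */
    char l1 = bump_local();
    char l2 = bump_local();    /* starts from "abc" again */

    assert(s2 == s1 + 1);
    assert(l2 == l1);
    return 0;
}
```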

Whether this missed optimization shows up depends on the compiler flags, the
architecture, the size of the array and the size of the initialized part. GCC
may instead choose (and usually does) to initialize the variable with store
instructions using immediate values, which is usually faster at the cost of
increased code size.