[Bug rtl-optimization/79032] New: Unaligned memory access in code generated for sparc 32 with LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79032

            Bug ID: 79032
           Summary: Unaligned memory access in code generated for sparc 32 with LRA
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cederman at gaisler dot com
  Target Milestone: ---
            Target: sparc

Hi,

The following code compiled with -O2 -mcpu=v8 -mlra causes an unaligned memory access:

typedef struct {
  short a;
  long long b;
  short c;
  _Bool d;
  unsigned short e;
  long *f
} g;

h(g *i) {
  long a = 1;
  a /= i->e;
  i->f[a]--;
  return 0;
}

With -mlra:

        mov     1, %g1
        sra     %g1, 31, %g2
        wr      %g2, 0, %y
        ld      [%o0+18], %g2   <- unaligned
        nop
        nop
        sdiv    %g1, %g2, %g1

With -mno-lra:

        lduh    [%o0+20], %g2   <- aligned
        ld      [%o0+24], %g3
        mov     1, %g1
        mov     0, %o0
        sra     %g1, 31, %g4
        wr      %g4, 0, %y
        nop
        nop
        nop
        sdiv    %g1, %g2, %g1
[Bug rtl-optimization/79032] [7 regression] unaligned memory access generated with LRA and optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79032 --- Comment #9 from Daniel Cederman --- Thanks for fixing it so quickly. Everything seems to be working now on my side.
[Bug lto/61192] Conflict between register and function name for lto on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61192

--- Comment #3 from Daniel Cederman ---
(In reply to Ilya Palachev from comment #2)
> (In reply to Daniel Cederman from comment #0)
> > when using lto on sparc.
>
> Daniel, can you also provide original source code (not preprocessed)? It's
> interesting whether this error can be reproduced on other architectures.

I used creduce on the source code, and this code triggers the error:

register int _SPARC_Per_CPU_current __asm__("g6");
int __getreent___trans_tmp_1;
__getreent() {
  int cpu_self = _SPARC_Per_CPU_current;
  __getreent___trans_tmp_1 = cpu_self;
}
g6() {}

I compiled with the same compiler as before; I have not tried a newer version of gcc.
[Bug tree-optimization/64193] New: Decreased performance after r173250
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64193

            Bug ID: 64193
           Summary: Decreased performance after r173250
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cederman at gaisler dot com
            Target: i686-build_pc-linux-gnu

Created attachment 34196
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34196&action=edit
preprocessed source

When comparing version 4.4.2 with version 4.9.2 I noticed a performance decrease for the Whetstone benchmark. It seems to have been introduced with r173250. Compiling Whetstone with the latest master (r217599) I get the following numbers:

Without r173250:
Loops: 500, Iterations: 1, Duration: 54 sec.
C Converted Double Precision Whetstones: 9259.3 MIPS

With r173250:
Loops: 500, Iterations: 1, Duration: 58 sec.
C Converted Double Precision Whetstones: 8620.7 MIPS

The assembly output has also increased in size. I have attached a preprocessed copy of the Whetstone benchmark and assembly output for i686-build_pc-linux-gnu compiled with -O3, with and without r173250.

Let me know if you need any more information.
[Bug tree-optimization/64193] Decreased performance after r173250
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64193 --- Comment #1 from Daniel Cederman --- Created attachment 34197 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34197&action=edit assembly output with r173250
[Bug tree-optimization/64193] Decreased performance after r173250
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64193 --- Comment #2 from Daniel Cederman --- Created attachment 34198 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34198&action=edit assembly output without r173250
[Bug tree-optimization/64193] [4.8/4.9/5 Regression] Decreased performance after r173250
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64193

--- Comment #5 from Daniel Cederman ---
> Probably the regression was mitigated by the partial fix for PR63677:

Yes, that seems to be the case for my attached example. Do you think that the regression is mitigated in general, or is your attached fix also needed?
[Bug lto/61192] New: Conflict between register and function name for lto on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61192

            Bug ID: 61192
           Summary: Conflict between register and function name for lto on sparc
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cederman at gaisler dot com

Created attachment 32799
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32799&action=edit
Preprocessed files

This line in the attached preprocessed file:

register struct Per_CPU_Control *_SPARC_Per_CPU_current __asm__( "g6" );

seems to conflict with this line:

int g6() {}

when using lto on sparc. If I change the function name I get no error. If I change both the register and the function name to g5 instead I get the same error.

$ sparc-rtems-gcc -v bug.i -flto -flto-partition=none -O2
Using built-in specs.
COLLECT_GCC=sparc-rtems-gcc
COLLECT_LTO_WRAPPER=/opt/rtems-4.11/libexec/gcc/sparc-rtems/4.9.1/lto-wrapper
Target: sparc-rtems
Configured with: ../../gcc/configure --target=sparc-rtems --with-gnu-as --with-gnu-ld --with-newlib --verbose --enable-threads --enable-languages=c,c++ --disable-nls --prefix=/opt/rtems-4.11 --enable-version-specific-runtime-libs --with-system-zlib --disable-libstdcxx-pch --disable-win32-registry --without-included-gettext
Thread model: rtems
gcc version 4.9.1 20140515 (prerelease) (GCC)
COLLECT_GCC_OPTIONS='-v' '-flto' '-flto-partition=none' '-O2' '-mcpu=v7'
 /opt/rtems-4.11/libexec/gcc/sparc-rtems/4.9.1/cc1 -fpreprocessed bug.i -quiet -dumpbase bug.i -mcpu=v7 -auxbase bug -O2 -version -flto -flto-partition=none -o /tmp/cczGxKWE.s
GNU C (GCC) version 4.9.1 20140515 (prerelease) (sparc-rtems)
        compiled by GNU C version 4.6.3, GMP version 5.0.5, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C (GCC) version 4.9.1 20140515 (prerelease) (sparc-rtems)
        compiled by GNU C version 4.6.3, GMP version 5.0.5, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 86c088340ea65899e71f0200a3b56ac7
In file included from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/cpuatomic.h:12:0,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/atomic.h:21,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/smplock.h:27,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/percpu.h:28,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/system.h:23,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems.h:29,
                 from bug.c:1:
/opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/cpustdatomic.h: In function '_CPU_atomic_Store_ptr':
/opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/cpustdatomic.h:179:117: warning: initialization makes integer from pointer without a cast
   atomic_store_explicit( object, pointer, (memory_order) order );
   ^
In file included from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/cpuatomic.h:12:0,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/atomic.h:21,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/smplock.h:27,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/percpu.h:28,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/system.h:23,
                 from /opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems.h:29,
                 from bug.c:1:
/opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/cpustdatomic.h: In function '_CPU_atomic_Compare_exchange_ptr':
/opt/rtems-4.11/sparc-rtems/leon3/lib/include/rtems/score/cpustdatomic.h:412:157: warning: initialization makes integer from pointer without a cast
   return atomic_compare_exchange_strong_explicit( object, old_pointer,
   ^
COLLECT_GCC_OPTIONS='-v' '-flto' '-flto-partition=none' '-O2' '-mcpu=v7'
 /opt/rtems-4.11/lib/gcc/sparc-rtems/4.9.1/../../../../sparc-rtems/bin/as -v -s -o /tmp/ccurbDPh.o /tmp/cczGxKWE.s
GNU assembler version 2.23.52 (sparc-rtems) using BFD version (GNU Binutils) 2.23.52.20130918
COMPILER_PATH=/opt/rtems-4.11/libexec/gcc/sparc-rtems/4.9.1/:/opt/rtems-4.11/libexec/gcc/sparc-rtems/4.9.1/:/opt/rtems-4.11/libexec/gcc/sparc-rtems/:/opt/rtems-4.11/lib/gcc/sparc-rtems/4.9.1/:/opt/rtems-4.11/lib/gcc/sparc-r
[Bug rtl-optimization/102306] New: Volatile pointer dereferenced twice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102306

            Bug ID: 102306
           Summary: Volatile pointer dereferenced twice
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cederman at gaisler dot com
                CC: ebotcazou at libertysurf dot fr, segher at kernel dot crashing.org
  Target Milestone: ---

Created attachment 51448
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51448&action=edit
Test case

The following code (full case in attachment) generates two loads from the volatile "a" pointer on SPARC:

C

  char b = *(volatile unsigned char *)a;
  if (!b)
    return;

Asm

        sethi   %hi(a), %g1
        ld      [%g1+%lo(a)], %g1
        ldub    [%g1], %g2
        ldub    [%g1], %g1
        cmp     %g1, 0

The code is intended to read a memory-mapped register whose value changes after a read, so two reads will give the wrong value to the comparison.

A bisect showed that this started to happen after "c4c5ad1d6d1e1e1fe7a: combine: Allow combining two insns to two insns".

The compiler (on commit fc4a29c0781186269dc, latest master at the time) was configured with:

configure --target=sparc-gaisler-elf --enable-languages=c --disable-nls --disable-libmudflap --disable-libssp --enable-version-specific-runtime-libs --disable-fixed-point --disable-decimal-float --disable-shared

and the test was compiled with:

sparc-gaisler-elf -mcpu=v8 -O2

I do not know if this is a generic problem or a SPARC-specific one.
[Bug target/107248] wrong scheduling of stack adjustment in leaf function at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107248

Daniel Cederman changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cederman at gaisler dot com

--- Comment #12 from Daniel Cederman ---
Just to make it clear, since we have had customers asking about it: it is still possible to trigger this issue with -mtune=leon or -mtune=leon3, though these options might make it less likely to happen.