[Bug target/91269] [9/10 regression] unaligned floating-point register with -mcpu=niagara4 -fcall-used-g6

2019-10-01 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91269

--- Comment #21 from Matt Turner  ---
(In reply to Eric Botcazou from comment #16)
> > I believe the Known to work field is wrong and gcc-8.3.0 has this bug as
> > well.
> 
> No, the field is correct and you're wrong.

Funny how the fix for the gcc-8.3.0 branch was a backport of this patch!

Have some manners next time.

[Bug target/91269] [9/10 regression] unaligned floating-point register with -mcpu=niagara4 -fcall-used-g6

2019-09-20 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91269

--- Comment #15 from Matt Turner  ---
I believe the Known to work field is wrong and gcc-8.3.0 has this bug as well.
Can we have this backported to the gcc-8 branch?

Thank you!

(FWIW, we also discovered that lz4-1.8.3 fails to build on 64-bit sparc due to
this bug as well)

[Bug target/91269] sparc64-gcc fails to build glibc (-fcall-used-g6) on niagara4: Assembler messages: Error: Illegal operands

2019-07-26 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91269

--- Comment #7 from Matt Turner  ---
(In reply to Sergei Trofimovich from comment #4)
> > Commenting out line '145  std %f9, [%fp+1999]' does not make
> > error disappear. Line numbers are probably skewed.
>
> Perhaps 1999 is too large an offset for 'std'.

Sergei noticed that 'std' must take an even numbered register, so s/f9/f8/ on
that line causes it to assemble.
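
To make the constraint concrete, here is a minimal C sketch of the check involved. The helper name `is_valid_double_fp_reg` is hypothetical and this is not GCC or binutils code; it only models the rule stated above, that a double-word floating-point store such as 'std' must name an even-numbered register.

```c
#include <stdbool.h>

/* Hypothetical helper, not taken from GCC or gas: models the rule that a
   double-word FP store such as 'std' names an even-numbered %f register. */
bool is_valid_double_fp_reg(unsigned regno)
{
    return (regno % 2) == 0;
}
```

Under this model %f9 is rejected and %f8 is accepted, which matches the s/f9/f8/ experiment.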

[Bug target/91269] sparc64-gcc fails to build glibc (-fcall-used-g6) on niagara4: Assembler messages: Error: Illegal operands

2019-07-26 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91269

--- Comment #6 from Matt Turner  ---
(In reply to Matt Turner from comment #5)
> With -mcpu=niagara4 and *without* -fcall-used-g6 it compiles fine.

Also doesn't occur with -O1 or -mno-lra.

[Bug target/91269] sparc64-gcc fails to build glibc (-fcall-used-g6) on niagara4: Assembler messages: Error: Illegal operands

2019-07-26 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91269

Matt Turner  changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com

--- Comment #5 from Matt Turner  ---
With -mcpu=niagara4 and *without* -fcall-used-g6 it compiles fine.

[Bug middle-end/87256] hppa spends huge amount of time in synth_mult()

2019-03-17 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87256

--- Comment #8 from Matt Turner  ---
This xxhash.c file is embedded in many different projects, and is really
causing problems on gentoo/hppa:

zstandard: Fri Mar 15 14:16:42 2019: 7 hours, 29 minutes, 49 seconds

Are we any closer to a fix than we were six months ago?

[Bug target/85235] [mips] Error: branch out of range

2018-04-05 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85235

Matt Turner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Matt Turner  ---
INVALID. This was caused by a bad binutils patch.

[Bug rtl-optimization/85235] New: [mips] Error: branch out of range

2018-04-05 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85235

Bug ID: 85235
   Summary: [mips] Error: branch out of range
   Product: gcc
   Version: 6.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mattst88 at gmail dot com
  Target Milestone: ---

Created attachment 43859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43859&action=edit
preprocessed (gzip'd) source file

On mips when attempting to compile cython's (cython.org) Code.c file, gcc
produces code that gas is unable to assemble, with

{standard input}:417146: Error: branch out of range
{standard input}:417161: Error: branch out of range
...

MIPS' gas supports --relax-branch which seems like it's supposed to be able to
sort this out, but it doesn't seem to.

Fails on gcc 6.4.0 and 7.3.0. Other versions untested. Totally guessing on the
Component field.

The attached Code.i demonstrates the problem with

mips64el-unknown-linux-gnu-gcc -mabi=n32 -O2 -march=loongson3a -mplt -pipe
-fno-strict-aliasing -fPIC -x c -c Code.i -o Code.o

[Bug target/71118] [5 Regression] ftois instruction not emitted for float -> int bitcast

2016-05-14 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71118

--- Comment #1 from Matt Turner  ---
(In reply to Matt Turner from comment #0)
> Created attachment 38490 [details]
> ftois.c
> 
> For the attached ftois.c, gcc-4.9.3 -O2 -mcpu=ev67 emits
> 
>  :
>0:   01 0f 1f 72 ftois   $f16,t0
>4:   00 00 e1 43 sextl   t0,v0

I might note that in the working case, the sextl is unnecessary. ftois does:

   Rc<63:32> ← SEXT(Fav<63>)
   Rc<31:0> ← Fav<63:62> || Fav <58:29>
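
The two assignments above can be modelled in C on the 64-bit register image. This is only a sketch of the bit movement as quoted; the function name `ftois_model` is made up for illustration. Because result bits 63:32 are copies of Fav<63>, which is also result bit 31, the value is already sign-extended, so a following sextl changes nothing.

```c
#include <stdint.h>

/* Illustrative model of the quoted ftois bit movement (name is hypothetical). */
uint64_t ftois_model(uint64_t fav)
{
    uint64_t hi2  = (fav >> 62) & 0x3;          /* Fav<63:62> -> Rc<31:30> */
    uint64_t lo30 = (fav >> 29) & 0x3FFFFFFF;   /* Fav<58:29> -> Rc<29:0>  */
    uint64_t low  = (hi2 << 30) | lo30;
    /* Rc<63:32> <- SEXT(Fav<63>) */
    uint64_t high = (fav >> 63) ? 0xFFFFFFFF00000000ULL : 0;

    return high | low;
}
```

For example, the register image of 1.0 (0x3FF0000000000000) maps to 0x3F800000, the single-precision encoding of 1.0.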

[Bug rtl-optimization/11488] Pre-regalloc scheduling severely worsens performance

2016-05-14 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488

Matt Turner  changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com

--- Comment #11 from Matt Turner  ---
(In reply to Steven Bosscher from comment #10)
> Someone should try with -fsched-pressure...

On alpha with gcc-5.3.0:

% gcc -O2 -mbwx idct3.c && time ./a.out 
./a.out  1.72s user 0.00s system 99% cpu 1.721 total

% gcc -O2 -fno-schedule-insns -mbwx idct3.c && time ./a.out 
./a.out  0.96s user 0.01s system 99% cpu 0.970 total

% gcc -O2 -fsched-pressure -mbwx idct3.c && time ./a.out 
./a.out  1.01s user 0.00s system 99% cpu 1.016 total

(-mbwx is needed, otherwise -fsched-pressure/-fno-schedule-insns doesn't show
any benefit)

So it looks like -fsched-pressure helps significantly, but not quite as much as
-fno-schedule-insns.
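
To put numbers on "significantly", the relative saving can be computed from the user times above. A small sketch; the helper name `time_saved` is just for illustration.

```c
/* Fraction of run time saved relative to a baseline, e.g. 0.25 for 25%. */
double time_saved(double baseline, double t)
{
    return (baseline - t) / baseline;
}
```

With the timings above, time_saved(1.72, 0.96) is about 0.44 for -fno-schedule-insns and time_saved(1.72, 1.01) is about 0.41 for -fsched-pressure.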

[Bug rtl-optimization/71119] New: [4.9 Regression] ftoit instruction not emitted for double -> long bitcast

2016-05-14 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71119

Bug ID: 71119
   Summary: [4.9 Regression] ftoit instruction not emitted for
double -> long bitcast
   Product: gcc
   Version: 4.9.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mattst88 at gmail dot com
  Target Milestone: ---

Created attachment 38491
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38491&action=edit
ftoit.c

For the attached ftoit.c, gcc-4.8.5 -O2 -mcpu=ev67 emits

 :
   0:   00 0e 1f 72 ftoit   $f16,v0
   4:   00 00 e0 43 sextl   v0,v0
   8:   01 80 fa 6b ret
   c:   00 00 fe 2f unop

0010 :
  10:   00 0e 1f 72 ftoit   $f16,v0
  14:   01 80 fa 6b ret
  18:   1f 04 ff 47 nop
  1c:   00 00 fe 2f unop

while gcc-4.9.3/gcc-5.3.0 -O2 -mcpu=ev67 emits

 :
   0:   00 0e 1f 72 ftoit   $f16,v0
   4:   00 00 e0 43 sextl   v0,v0
   8:   01 80 fa 6b ret
   c:   00 00 fe 2f unop

0010 :
  10:   f0 ff de 23 lda sp,-16(sp)
  14:   00 00 1e 9e stt $f16,0(sp)
  18:   00 00 1e a4 ldq v0,0(sp)
  1c:   10 00 de 23 lda sp,16(sp)
  20:   01 80 fa 6b ret
  24:   00 00 fe 2f unop
  28:   1f 04 ff 47 nop
  2c:   00 00 fe 2f unop

In fact, the alpha architecture reference says that ftoit is exactly equivalent
to an stt/ldq sequence.

f2l should have used ftoit, as with gcc-4.8.5.
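
The attached ftoit.c is not reproduced in this thread. As an assumption about its general shape, a double -> 64-bit integer bitcast is typically written as below; the name `f2l_sketch` is hypothetical, and int64_t stands in for Alpha's 64-bit long.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a double -> long bitcast: copies the object
   representation without any numeric conversion. */
int64_t f2l_sketch(double d)
{
    int64_t l;

    memcpy(&l, &d, sizeof l);
    return l;
}
```

This is the kind of function for which a single ftoit would be expected instead of a store/load pair through the stack.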

[Bug rtl-optimization/71118] New: [5 Regression] ftois instruction not emitted for float -> int bitcast

2016-05-14 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71118

Bug ID: 71118
   Summary: [5 Regression] ftois instruction not emitted for float
-> int bitcast
   Product: gcc
   Version: 5.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mattst88 at gmail dot com
  Target Milestone: ---

Created attachment 38490
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38490&action=edit
ftois.c

For the attached ftois.c, gcc-4.9.3 -O2 -mcpu=ev67 emits

 :
   0:   01 0f 1f 72 ftois   $f16,t0
   4:   00 00 e1 43 sextl   t0,v0
   8:   01 80 fa 6b ret
   c:   00 00 fe 2f unop

0010 :
  10:   01 0f 1f 72 ftois   $f16,t0
  14:   20 f6 21 48 zapnot  t0,0xf,v0
  18:   01 80 fa 6b ret
  1c:   00 00 fe 2f unop


while gcc-5.3.0 -O2 -mcpu=ev67 emits

 :
   0:   f0 ff de 23 lda sp,-16(sp)
   4:   00 00 1e 9a sts $f16,0(sp)
   8:   00 00 1e a0 ldl v0,0(sp)
   c:   10 00 de 23 lda sp,16(sp)
  10:   01 80 fa 6b ret
  14:   00 00 fe 2f unop
  18:   1f 04 ff 47 nop
  1c:   00 00 fe 2f unop

0020 :
  20:   01 0f 1f 72 ftois   $f16,t0
  24:   20 f6 21 48 zapnot  t0,0xf,v0
  28:   01 80 fa 6b ret
  2c:   00 00 fe 2f unop

In fact, the alpha architecture reference says that ftois is exactly equivalent
to an sts/ldl sequence.

f2i should have used ftois, as with gcc-4.9.3.
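
The listings above show one function finishing with sextl and the other with zapnot t0,0xf, which would be consistent with a signed and an unsigned 32-bit result. That reading is an inference, and the attached ftois.c is not reproduced here. A sketch of such a pair, with hypothetical names:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical float -> signed 32-bit bitcast. */
int32_t f2i_sketch(float f)
{
    int32_t i;

    memcpy(&i, &f, sizeof i);
    return i;
}

/* Hypothetical float -> unsigned 32-bit bitcast. */
uint32_t f2u_sketch(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    return u;
}
```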

[Bug tree-optimization/68548] New: gcc wrongly warns about uninitialized data

2015-11-25 Thread mattst88 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68548

Bug ID: 68548
   Summary: gcc wrongly warns about uninitialized data
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mattst88 at gmail dot com
  Target Milestone: ---

Created attachment 36842
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36842&action=edit
u.c

gcc -Wmaybe-uninitialized wrongly warns about data0 being uninitialized in the
attached test case.

This may be a duplicate of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42145
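
The attached u.c is not reproduced here. As an assumption about the general shape of such reports, the pattern below assigns and reads a variable under the same condition, so every read is preceded by a write. Whether any particular GCC version warns on this exact function is not claimed; the name `read_if_present` is made up for illustration.

```c
/* data0 is assigned only when 'have' is nonzero and read only under the same
   condition, so it is never read uninitialized. */
int read_if_present(int have, int value)
{
    int data0;

    if (have)
        data0 = value;

    /* unrelated work could sit here */

    if (have)
        return data0;
    return 0;
}
```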

[Bug target/45941] Failed compile on Loongson2f

2013-07-19 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45941

--- Comment #3 from Matt Turner mattst88 at gmail dot com ---
(In reply to Steve Ellcey from comment #2)
> Since the gentoo bug report (pointed at by comment #1) is closed and I think

Well, the reason I closed the bug was because Upstream doesn't give a shit so
that shouldn't be justification :)

> the 4.5 branch of GCC is also closed now and the bug wasn't reproducible
> with GCC 4.6.1, I think this bug report should be closed.  Any objections?

But yes, gcc-4.5 is closed and 4.6+ doesn't have the problem, so we should
close this bug.


[Bug rtl-optimization/49682] [alpha] gcc-4.6.1: ICE at -O2 and -O3

2011-10-30 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49682

Matt Turner mattst88 at gmail dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Known to work||4.6.2
 Resolution||FIXED

--- Comment #2 from Matt Turner mattst88 at gmail dot com 2011-10-31 03:48:14 
UTC ---
Indeed, this is fixed with 4.6.2.


[Bug target/36798] internal compiler error: in arm_expand_binop_builtin, at config/arm/arm.c:12548

2011-10-01 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36798

Matt Turner mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com

--- Comment #8 from Matt Turner mattst88 at gmail dot com 2011-10-01 15:24:46 
UTC ---
This is a duplicate of bug 35294.


[Bug target/35294] iwmmxt intrinsics, internal compiler error

2011-10-01 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35294

--- Comment #14 from Matt Turner mattst88 at gmail dot com 2011-10-01 
15:26:10 UTC ---
Created attachment 25391
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25391
[PATCH] Wire-up missing ARM iwmmxt intrinsics

Fixes it for me for gcc-4.6.1. Allows me to build an iwmmxt-optimized pixman
using the standard _mm_* intrinsics.


[Bug target/36966] arm iwmmxt builtin problem

2011-10-01 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36966

--- Comment #4 from Matt Turner mattst88 at gmail dot com 2011-10-01 15:24:57 
UTC ---
This is a duplicate of bug 35294.


[Bug c/45941] Failed compile on Loongson2f

2011-08-14 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45941

--- Comment #1 from Matt Turner mattst88 at gmail dot com 2011-08-15 04:33:57 
UTC ---
Created attachment 25010
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25010
lto-streamer-in.i.bz2

Test case (preprocessed lto-streamer-in.c from gcc-4.5.3 sources).

$ mips64el-unknown-linux-gnu-gcc -O2 -mabi=64 -c lto-streamer-in.i 
mips64el-unknown-linux-gnu-gcc: Internal error: Segmentation fault (program
cc1)
Please submit a full bug report.
See http://bugs.gentoo.org/ for instructions.

Notes:
 - The segfault is reproducible with mips64el-unknown-linux-gnu-4.5.x
   (tested 4.5.2 and 4.5.3)
 - The segfault is not reproducible with 4.6.1.
 - The segfault only occurs with -O2 and -mabi=64. Other -O levels or -mabi=...
   flags do not trigger the crash, including '-O3 -mabi=64'
 - The segfault is not reproducible with [big endian] mips64-unknown-linux-gnu
  (tried 4.5.2, 4.5.3, and 4.6.1)

Filed as Gentoo bug https://bugs.gentoo.org/show_bug.cgi?id=378375


[Bug target/36966] arm iwmmxt builtin problem

2011-07-14 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36966

Matt Turner mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com

--- Comment #3 from Matt Turner mattst88 at gmail dot com 2011-07-14 21:18:56 
UTC ---
Confirmed with 4.4.5 and 4.5.2:

$ gcc -O2 -march=iwmmxt -c 36966.c -o 36966.c 
36966.c: In function 'foo':
36966.c:5: internal compiler error: in arm_expand_binop_builtin, at
config/arm/arm.c:16192
Please submit a full bug report,
with preprocessed source if appropriate.
See http://bugzilla.redhat.com/bugzilla for instructions.
Preprocessed source stored into /tmp/ccG40Hm6.out file, please attach this to
your bugreport.


$ armv5tel-unknown-linux-gnueabi-gcc -O2 -march=iwmmxt -c 36966.c 
36966.c: In function 'foo':
36966.c:5:31: internal compiler error: in arm_expand_binop_builtin, at
config/arm/arm.c:17895
Please submit a full bug report,
with preprocessed source if appropriate.
See http://bugs.gentoo.org/ for instructions.


[Bug target/47230] [4.6/4.7 Regression] gcc fails to bootstrap on alpha in stage2 with relocation truncated to fit: GPREL16 against ...

2011-07-09 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47230

--- Comment #13 from Matt Turner mattst88 at gmail dot com 2011-07-09 
20:55:23 UTC ---
(In reply to comment #12)
> Since this is linker bug, I have added rth to CC in the hope that he has
> better solution.

Bug link is http://sources.redhat.com/bugzilla/show_bug.cgi?id=5276


[Bug rtl-optimization/49682] New: [alpha] gcc-4.6.1: ICE at -O2 and -O3

2011-07-08 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49682

   Summary: [alpha] gcc-4.6.1: ICE at -O2 and -O3
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: matts...@gmail.com


Created attachment 24722
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24722
preprocessed crl.i (from openssl-1.0.0d)

The attached preprocessed file from openssl-1.0.0d causes an internal compiler
error on Alpha with -O2 or -O3 optimization level.

# gcc -O2 -c crl.i 
crl.c: In function crl_main:
crl.c:403:2: internal compiler error: in ready_remove_first, at
haifa-sched.c:1414
Please submit a full bug report,
with preprocessed source if appropriate.
See http://bugs.gentoo.org/ for instructions.


[Bug target/47230] [4.6/4.7 Regression] gcc fails to bootstrap on alpha in stage2 with relocation truncated to fit: GPREL16 against ...

2011-07-08 Thread mattst88 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47230

Matt Turner mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com

--- Comment #11 from Matt Turner mattst88 at gmail dot com 2011-07-08 
20:42:37 UTC ---
gcc-4.6.1 fails to bootstrap gcc-4.6.1 even with binutils-2.21.1 + fixes for
12608 and 12610 for me.


[Bug c++/45382] New: internal compiler error: tree code ‘call_expr’ is not supported in gimple streams

2010-08-22 Thread mattst88 at gmail dot com
# gcc -v
Using built-in specs.
COLLECT_GCC=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.1/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-4.5.1/work/gcc-4.5.1/configure
--prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.1
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.1/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.1
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.1/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.1/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.1/include/g++-v4
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec
--disable-fixed-point --with-ppl --with-cloog --enable-lto --enable-nls
--without-included-gettext --with-system-zlib --disable-werror
--enable-secureplt --enable-multilib --enable-libmudflap --disable-libssp
--disable-libgomp --enable-cld
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.5.1/python
--enable-checking=release --disable-libgcj --enable-languages=c,c++
--enable-shared --enable-threads=posix --enable-__cxa_atexit
--enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/
--with-pkgversion='Gentoo 4.5.1 p1.0, pie-0.4.5'
Thread model: posix
gcc version 4.5.1 (Gentoo 4.5.1 p1.0, pie-0.4.5)

Building the attached preprocessed file, I can cause an ICE
# g++ -flto -c SmallStrings.i -o SmallStrings.o
JavaScriptCore/runtime/SmallStrings.cpp:141:1: internal compiler error: tree
code ‘call_expr’ is not supported in gimple streams
Please submit a full bug report,
with preprocessed source if appropriate.
See http://bugs.gentoo.org/ for instructions.

The file is from webkit-gtk-1.2.3.


-- 
   Summary: internal compiler error: tree code ‘call_expr’ is not
supported in gimple streams
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mattst88 at gmail dot com
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45382



[Bug c++/45382] internal compiler error: tree code ‘call_expr’ is not supported in gimple streams

2010-08-22 Thread mattst88 at gmail dot com


--- Comment #1 from mattst88 at gmail dot com  2010-08-22 22:01 ---
Created an attachment (id=21546)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21546&action=view)
SmallStrings.i from webkit-gtk-1.2.3


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45382



[Bug c++/43850] ice: tree code ‘template_type_parm’ is not supported in gimple streams

2010-08-22 Thread mattst88 at gmail dot com


--- Comment #9 from mattst88 at gmail dot com  2010-08-22 22:02 ---
Can this code be backported to the 4.5 branch?


-- 

mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43850



[Bug rtl-optimization/44123] New: gcc produces bad code at -O1

2010-05-13 Thread mattst88 at gmail dot com
 de 23 lda sp,16(sp)
  44:   01 80 fa 6b ret

# gcc-4.4.3 -O2 -mcpu=ev67 -c z.c && objdump -d z.o

z.o: file format elf64-alpha

Disassembly of section .text:

 __ffs:
   0:   60 06 f0 73 cttz a0,v0
   4:   01 80 fa 6b ret
   8:   1f 04 ff 47 nop
   c:   00 00 fe 2f unop

0010 foo:
  10:   00 00 bb 27 ldah gp,0(t12)
  14:   00 00 bd 23 lda gp,0(gp)
  18:   f0 ff de 23 lda sp,-16(sp)
  1c:   08 00 30 a4 ldq t0,8(a0)
  20:   08 00 3e b5 stq s0,8(sp)
  24:   00 00 30 a5 ldq s0,0(a0)
  28:   00 00 5e b7 stq ra,0(sp)
  2c:   10 04 e1 47 mov t0,a0
  30:   d0 04 29 45 cmovne  s0,s0,a0
  34:   a9 15 20 41 cmpeq   s0,0,s0
  38:   29 d7 20 49 sll s0,0x6,s0
  3c:   00 00 40 d3 bsr ra,40 foo+0x30
  40:   00 04 20 41 addq s0,v0,v0
  44:   00 00 5e a7 ldq ra,0(sp)
  48:   08 00 3e a5 ldq s0,8(sp)
  4c:   10 00 de 23 lda sp,16(sp)
  50:   01 80 fa 6b ret
  54:   00 00 fe 2f unop
  58:   1f 04 ff 47 nop
  5c:   00 00 fe 2f unop

# gcc-4.4.3 -O3 -mcpu=ev67 -c z.c && objdump -d z.o

z.o: file format elf64-alpha

Disassembly of section .text:

 __ffs:
   0:   60 06 f0 73 cttz a0,v0
   4:   01 80 fa 6b ret
   8:   1f 04 ff 47 nop
   c:   00 00 fe 2f unop

0010 foo:
  10:   00 00 30 a4 ldq t0,0(a0)
  14:   08 00 10 a4 ldq v0,8(a0)
  18:   c0 04 21 44 cmovne  t0,t0,v0
  1c:   a1 15 20 40 cmpeq   t0,0,t0
  20:   21 d7 20 48 sll t0,0x6,t0
  24:   60 06 e0 73 cttz v0,v0
  28:   00 04 01 40 addq v0,t0,v0
  2c:   01 80 fa 6b ret

# gcc-4.5.0 -Os -mcpu=ev67 -c z.c && objdump -d z.o

z.o: file format elf64-alpha

Disassembly of section .text:

 __ffs:
   0:   60 06 f0 73 cttz a0,v0
   4:   01 80 fa 6b ret

0008 foo:
   8:   00 00 30 a4 ldq t0,0(a0)
   c:   08 00 10 a4 ldq v0,8(a0)
  10:   c0 04 21 44 cmovne  t0,t0,v0
  14:   a1 15 20 40 cmpeq   t0,0,t0
  18:   21 d7 20 48 sll t0,0x6,t0
  1c:   60 06 e0 73 cttz v0,v0
  20:   00 04 01 40 addq v0,t0,v0
  24:   01 80 fa 6b ret

# gcc-4.5.0 -O1 -mcpu=ev67 -c z.c && objdump -d z.o

z.o: file format elf64-alpha

Disassembly of section .text:

 __ffs:
   0:   60 06 f0 73 cttz a0,v0
   4:   01 80 fa 6b ret

0008 foo:
   8:   00 00 bb 27 ldah gp,0(t12)
   c:   00 00 bd 23 lda gp,0(gp)
  10:   f0 ff de 23 lda sp,-16(sp)
  14:   00 00 5e b7 stq ra,0(sp)
  18:   08 00 3e b5 stq s0,8(sp)
  1c:   00 00 30 a4 ldq t0,0(a0)
  20:   08 00 10 a6 ldq a0,8(a0)
  24:   a9 15 20 40 cmpeq   t0,0,s0
  28:   29 d7 20 49 sll s0,0x6,s0
  2c:   d0 04 21 44 cmovne  t0,t0,a0
  30:   00 00 40 d3 bsr ra,34 foo+0x2c
  34:   00 04 20 41 addq s0,v0,v0
  38:   00 00 5e a7 ldq ra,0(sp)
  3c:   08 00 3e a5 ldq s0,8(sp)
  40:   10 00 de 23 lda sp,16(sp)
  44:   01 80 fa 6b ret

# gcc-4.5.0 -O2 -mcpu=ev67 -c z.c && objdump -d z.o

z.o: file format elf64-alpha

Disassembly of section .text:

 __ffs:
   0:   60 06 f0 73 cttz a0,v0
   4:   01 80 fa 6b ret
   8:   1f 04 ff 47 nop
   c:   00 00 fe 2f unop

0010 foo:
  10:   00 00 30 a4 ldq t0,0(a0)
  14:   08 00 10 a4 ldq v0,8(a0)
  18:   c0 04 21 44 cmovne  t0,t0,v0
  1c:   a1 15 20 40 cmpeq   t0,0,t0
  20:   21 d7 20 48 sll t0,0x6,t0
  24:   60 06 e0 73 cttz v0,v0
  28:   00 04 01 40 addq v0,t0,v0
  2c:   01 80 fa 6b ret

# gcc-4.5.0 -O3 -mcpu=ev67 -c z.c && objdump -d z.o

z.o: file format elf64-alpha

Disassembly of section .text:

 __ffs:
   0:   60 06 f0 73 cttz a0,v0
   4:   01 80 fa 6b ret
   8:   1f 04 ff 47 nop
   c:   00 00 fe 2f unop

0010 foo:
  10:   00 00 30 a4 ldq t0,0(a0)
  14:   08 00 10 a4 ldq v0,8(a0)
  18:   c0 04 21 44 cmovne  t0,t0,v0
  1c:   a1 15 20 40 cmpeq   t0,0,t0
  20:   21 d7 20 48 sll t0,0x6,t0
  24:   60 06 e0 73 cttz v0,v0
  28:   00 04 01 40 addq v0,t0,v0
  2c:   01 80 fa 6b ret

4.3.4 -Os: good
4.3.4 -O1: bad
4.3.4 -O2: good
4.3.4 -O3: good

4.4.3 -Os: bad
4.4.3 -O1: bad
4.4.3 -O2: bad
4.4.3 -O3: good

4.5.0 -Os: good
4.5.0 -O1: bad
4.5.0 -O2: good
4.5.0 -O3: good


-- 
   Summary: gcc produces bad code at -O1
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mattst88 at gmail dot com
 GCC build triplet: alpha-unknown-linux-gnu
  GCC host triplet: alpha-unknown-linux-gnu
GCC target triplet: alpha-unknown-linux-gnu


http

[Bug rtl-optimization/44123] gcc produces poor code at -O1

2010-05-13 Thread mattst88 at gmail dot com


--- Comment #2 from mattst88 at gmail dot com  2010-05-13 21:40 ---
(In reply to comment #1)
> What do you mean by bad?  If the code isn't correct, wrong is better
> suited; if it is suboptimal, poor is better suited.
> 
> If the latter, it's expected that -O1 generates poorer code than -O2/-O3/-Os.

Yes, poor is a better word.

And by poor, I mean that gcc produces many superfluous loads and stores and
even a branch.


-- 

mattst88 at gmail dot com changed:

   What|Removed |Added

 Status|WAITING |UNCONFIRMED
Summary|gcc produces bad code at -O1|gcc produces poor code at -
   ||O1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44123



[Bug c/43691] New: Code segfault when compiled with -Os, -O2, or -O3

2010-04-08 Thread mattst88 at gmail dot com
When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3 it
segfaults. -O0 and -O1 allow it to run correctly.

Moving the inline assembly into a separate file and including it in the
compilation allow the program to run correctly at all -O levels.

My results are
gcc -O0 -mcpu=ev67 -D__FAIL rewritten.S test.c -o test && ./test # works
gcc -O1 -mcpu=ev67 -D__FAIL rewritten.S test.c -o test && ./test # works
gcc -Os -mcpu=ev67 -D__FAIL rewritten.S test.c -o test && ./test # segfault
gcc -O2 -mcpu=ev67 -D__FAIL rewritten.S test.c -o test && ./test # segfault
gcc -O3 -mcpu=ev67 -D__FAIL rewritten.S test.c -o test && ./test # segfault

Compiling without -D__FAIL causes the external assembly in rewritten.S to be
used. Without -D__FAIL, the program runs correctly at all -O levels.


-- 
   Summary: Code segfault when compiled with -Os, -O2, or -O3
   Product: gcc
   Version: 4.4.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mattst88 at gmail dot com
 GCC build triplet: alpha-unknown-linux-gnu
  GCC host triplet: alpha-unknown-linux-gnu
GCC target triplet: alpha-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691



[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3

2010-04-08 Thread mattst88 at gmail dot com


--- Comment #1 from mattst88 at gmail dot com  2010-04-08 16:50 ---
Created an attachment (id=20337)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20337&action=view)
rewritten.S - external assembly


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691



[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3

2010-04-08 Thread mattst88 at gmail dot com


--- Comment #2 from mattst88 at gmail dot com  2010-04-08 16:50 ---
Created an attachment (id=20338)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20338&action=view)
test.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691



[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3

2010-04-08 Thread mattst88 at gmail dot com


--- Comment #7 from mattst88 at gmail dot com  2010-04-08 17:53 ---
(In reply to comment #4)
> (In reply to comment #0)
> > When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3
> > it segfaults. -O0 and -O1 allow it to run correctly.
> > 
> > Moving the inline assembly into a separate file and including it in the
> > compilation allow the program to run correctly at all -O levels.
> 
> From these symptoms, it is practically certain that you have done something
> wrong with the asm inputs and outputs.  I don't have an Alpha compiler to
> hand, but just from looking at your code, I bet it will work correctly if you
> rewrite it like so:
> 
> unsigned long rewritten(const unsigned long b[2]) {
>     unsigned long ofs, output;
> 
>     asm(
>         "cmoveq %0,64,%1    # ofs    = (b[0] ? ofs : 64);\n"
>         "cmoveq %0,%2,%0    # temp   = (b[0] ? b[0] : b[1]);\n"
>         "cttz   %0,%0       # output = cttz(temp);\n"
>         : "=r" (output), "=r" (ofs)
>         : "r" (b[1]), "0" (b[0]), "1" (0)
>     );
>     return output + ofs;
> }

Yep, your code works.

> (I've assumed that the semantic of cmoveq a,b,c is if (a==0) c=b;)
> 
> The trick with asm() is to do as little as possible.  I assume that the reason
> the assembly version beats the pure-C version is the cmoveq's, so I stripped
> the setup code and the addition.  This allows me to express the _real_
> argument constraints rather than fake ones, which lets me be confident that
> the optimizers will do what you want.  Note that this also means volatile is
> unnecessary.
> 
> As a general principle, if you find yourself writing an asm() with a big long
> list of earlyclobber outputs but no inputs, you are doing it wrong.
 

Thanks a ton for the advice. You knocked that out of the water.

Marking as INVALID.
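
For comparison, the comments inside the quoted asm describe a computation that can be written in portable C. The following is a reference sketch only: the names `ctz64` and `rewritten_ref` are hypothetical, uint64_t stands in for Alpha's 64-bit unsigned long, and returning 64 for a zero input is an assumption about cttz.

```c
#include <stdint.h>

/* Count trailing zero bits; returns 64 for x == 0 (assumed cttz behaviour). */
static unsigned ctz64(uint64_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 64;
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Index of the lowest set bit across a 128-bit value held in two words. */
uint64_t rewritten_ref(const uint64_t b[2])
{
    uint64_t ofs  = b[0] ? 0 : 64;        /* ofs  = (b[0] ? ofs : 64), ofs starts at 0 */
    uint64_t temp = b[0] ? b[0] : b[1];   /* temp = (b[0] ? b[0] : b[1]) */

    return ctz64(temp) + ofs;
}
```

A reference like this is handy for checking an inline-asm version against known inputs at every -O level.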


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691



[Bug target/42113] [4.3/4.4/4.5 Regression] Internal Compiler error with -O3, breaking commit known

2009-11-22 Thread mattst88 at gmail dot com


--- Comment #9 from mattst88 at gmail dot com  2009-11-22 17:52 ---
WRT the test suite: should it be 

/* { dg-options "-O2" } */

or

/* { dg-options "-O3" } */

That is, -O2 or -O3? I could only produce the internal compiler error with -O3,
and not at all with -Os, -O0, -O1, -O2.
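
For context, a GCC testsuite file carries its options in DejaGnu comment directives, with the option string in double quotes. A minimal sketch, assuming -O3 is chosen because that is the only level where the error reproduced; the function is a placeholder, not the actual test case.

```c
/* { dg-do compile } */
/* { dg-options "-O3" } */

/* Placeholder body: a real test would carry the reduced code that triggers
   the internal compiler error, which is not reproduced here. */
int placeholder(int x)
{
    return x + 1;
}
```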


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42113



[Bug target/42113] [4.3/4.4/4.5 Regression] Internal Compiler error with -O3, breaking commit known

2009-11-21 Thread mattst88 at gmail dot com


--- Comment #7 from mattst88 at gmail dot com  2009-11-21 16:15 ---
I can confirm that the attached patch fixes the issue. Thanks!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42113



[Bug regression/42113] New: [4.3/4.4/4.5 Regression] Internal Compiler error with -O3, breaking commit known

2009-11-19 Thread mattst88 at gmail dot com
Unfortunately, the patch from bug 8603 seems to cause internal compiler errors
on select files when using -O3. -O{s,0,1,2} are unaffected.

I'm attaching pp.i (preprocessed pp.c from libperl), and flist.i (preprocessed
flist.c from rsync) as test cases.

gcc-4.3.4 does not exhibit this failure, but when patched from bug 8603, it
fails. gcc-4.4.1 does not exhibit this failure, but gcc-4.4.2, which includes
the patch from 8603, does fail.


-- 
   Summary: [4.3/4.4/4.5 Regression] Internal Compiler error with -
O3, breaking commit known
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mattst88 at gmail dot com
 GCC build triplet: alpha-unknown-linux-gnu
  GCC host triplet: alpha-unknown-linux-gnu
GCC target triplet: alpha-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42113



[Bug regression/42113] [4.3/4.4/4.5 Regression] Internal Compiler error with -O3, breaking commit known

2009-11-19 Thread mattst88 at gmail dot com


--- Comment #1 from mattst88 at gmail dot com  2009-11-20 00:44 ---
Created an attachment (id=19061)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19061&action=view)
Test Case 1 - pp.i - preprocessed pp.c from libperl


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42113



[Bug regression/42113] [4.3/4.4/4.5 Regression] Internal Compiler error with -O3, breaking commit known

2009-11-19 Thread mattst88 at gmail dot com


--- Comment #2 from mattst88 at gmail dot com  2009-11-20 00:45 ---
Created an attachment (id=19062)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19062&action=view)
Test Case 2 - flist.i - preprocessed flist.c from rsync


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42113



[Bug target/8603] [Alpha] s?addl pattern doesn't work

2009-08-10 Thread mattst88 at gmail dot com


--- Comment #6 from mattst88 at gmail dot com  2009-08-11 02:38 ---
To show how worthwhile this trivial patch is -- the following table shows the
number of times s{4,8}{add,sub}l are used in building the Linux kernel
(2.6.31-rc5) with unpatched and patched gcc (4.3.4).

          unpatched   patched
s4addl       53         395
s8addl       79         132
s4subl        0         111
s8subl        0          35

This patch also causes gcc to produce exactly the same output as Compaq's C
compiler (this is a good thing!) for the two test cases given in the report.

For example --
Test case:
UP1500 gcc-tests # cat s_addl.c 
int f(int x, int y) { return 4 * x + y; }
int g(int x) { return 3 * x; }

Results with unpatched gcc-4.3.x

UP1500 gcc-tests # gcc-unpatched -O3 -mcpu=ev67 -c s_addl.c 
UP1500 gcc-tests # objdump -d s_addl.o 

s_addl.o: file format elf64-alpha

Disassembly of section .text:

 f:
   0:   40 04 11 42 s4addq  a0,a1,v0
   4:   00 00 e0 43 sextl   v0,v0 <-- unnecessary
   8:   01 80 fa 6b ret
   c:   00 00 fe 2f unop

0010 g:
  10:   60 05 10 42 s4subq  a0,a0,v0
  14:   00 00 e0 43 sextl   v0,v0 <-- unnecessary
  18:   01 80 fa 6b ret
  1c:   00 00 fe 2f unop

Results with patched gcc-4.3.x

UP1500 gcc-tests # gcc-patched -O3 -mcpu=ev67 -c s_addl.c 
UP1500 gcc-tests # objdump -d s_addl.o 

s_addl.o: file format elf64-alpha

Disassembly of section .text:

 f:
   0:   40 00 11 42 s4addl  a0,a1,v0
   4:   01 80 fa 6b ret
   8:   1f 04 ff 47 nop
   c:   00 00 fe 2f unop

0010 g:
  10:   60 01 10 42 s4subl  a0,a0,v0
  14:   01 80 fa 6b ret
  18:   1f 04 ff 47 nop
  1c:   00 00 fe 2f unop

Results with Compaq C compiler (what we're trying to replicate)

UP1500 gcc-tests # ccc -fast -host -c s_addl.c 
UP1500 gcc-tests # objdump -d s_addl.o 

s_addl.o: file format elf64-alpha

Disassembly of section .text:

 f:
   0:   40 00 11 42 s4addl  a0,a1,v0
   4:   01 80 fa 6b ret
   8:   00 00 fe 2f unop
   c:   00 00 fe 2f unop

0010 g:
  10:   60 01 10 42 s4subl  a0,a0,v0
  14:   01 80 fa 6b ret
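
As a rough illustration of why the separate sextl becomes unnecessary, here is a
small C model of the instructions involved. The `reg` type and all function
names are hypothetical, made up for this sketch. It assumes the "l" forms
sign-extend the low 32 bits of the result, which is what the extra sextl does
after the "q" forms.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit Alpha integer register. */
typedef uint64_t reg;

/* sextl: sign-extend the low 32 bits of a register to 64 bits. */
reg sextl(reg v)
{
    uint32_t lo = (uint32_t)v;
    return (lo & 0x80000000u) ? ((reg)lo | 0xffffffff00000000ULL) : (reg)lo;
}

/* Quadword (64-bit) scaled add and subtract. */
reg s4addq(reg a, reg b) { return a * 4 + b; }
reg s4subq(reg a, reg b) { return a * 4 - b; }

/* Longword forms: same arithmetic, result sign-extended from 32 bits. */
reg s4addl(reg a, reg b) { return sextl(a * 4 + b); }
reg s4subl(reg a, reg b) { return sextl(a * 4 - b); }

/* f() as in the unpatched listing: s4addq followed by sextl. */
reg f_unpatched(reg a0, reg a1) { return sextl(s4addq(a0, a1)); }

/* f() as in the patched and ccc listings: a single s4addl. */
reg f_patched(reg a0, reg a1) { return s4addl(a0, a1); }
```

In this model f_unpatched() and f_patched() agree for every input, so the
two-instruction sequence can always be replaced by the single longword
instruction.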


Please add to gcc-4.3.x and gcc-4.4.x.


-- 

mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8603



[Bug rtl-optimization/27468] sign-extending Alpha instructions not exploited

2009-04-18 Thread mattst88 at gmail dot com


--- Comment #2 from mattst88 at gmail dot com  2009-04-18 15:22 ---
For reference, here's what the Compaq C compiler generates for each of these.

(In reply to comment #0)
> The sign-extending Alpha instructions like addl are sometimes not used. I
> don't know whether the SEE pass is supposed to affect this, or whether it
> is something a combiner pass should do...


Compaq C:
 f5:
   0:   10 00 f0 43 sextl   a0,a0
   4:   20 01 f0 43 negl    a0,v0
   8:   c0 08 10 46 cmovge  a0,a0,v0
   c:   00 00 fe 2f unop
  10:   01 80 fa 6b ret

> #include <stdlib.h>
> 
> /* gcc 4.2.0 20060506:
>         negq    a0,v0
>         cmovge  a0,a0,v0
>         sextl   v0,v0
>    optimal:
>         negl    a0,v0
>         cmovge  a0,a0,v0  */
> int f5(int x) {
>     return abs(x);
> }
 
 

Compaq C:
 f23:
   0:   30 17 06 4a sll a0,0x30,a0
   4:   90 17 06 4a sra a0,0x30,a0
   8:   62 05 10 42 s4subq  a0,a0,t1
   c:   42 06 50 40 s8addq  t1,a0,t1
  10:   42 06 50 40 s8addq  t1,a0,t1
  14:   40 06 50 40 s8addq  t1,a0,v0
  18:   01 80 fa 6b ret

> /* gcc 4.2.0 20060506:
>         s4addq  a0,a0,v0
>         s4addq  v0,v0,v0
>         s8addq  v0,a0,v0
>         s8addq  v0,a0,v0 #
>         sextl   v0,v0    # can be combined to s8addl  v0,a0,v0 */
> int64_t f23(int16_t x) {
>     return 1609 * x;
> }
 
 

Compaq C:
 f49:
   0:   00 80 5f 24 ldah    t1,-32768
   4:   00 80 7f 24 ldah    t2,-32768
   8:   02 08 02 46 xor a0,t1,t1
   c:   00 00 43 40 addl    t1,t2,v0
  10:   01 80 fa 6b ret

> /* gcc 4.2.0 20060506:
>         ldah    t0,-32768
>         xor     a0,t0,v0
>         addq    v0,t0,v0 #
>         sextl   v0,v0    # can be combined to addl    v0,t0,v0 */
> unsigned f49(unsigned val) {
>     return (val ^ 0x8000) - 0x8000;
> }
 

In the first two cases, the Compaq C compiler seems to be more careful about
input arguments: it explicitly sign-extends them before use (sextl in f5,
sll/sra by 0x30 in f23).
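
For f23, both compilers build the multiplication by 1609 out of scaled adds and
subtracts, just along different chains. A minimal C sketch of the two chains
follows. The function names are hypothetical, and plain 64-bit arithmetic stands
in for the quadword instructions, so the argument sign-extension and the
trailing sextl are not modelled.

```c
#include <stdint.h>

/* Chain from the gcc 4.2.0 listing: 5x, 25x, 201x, 1609x. */
int64_t mul1609_gcc(int64_t x)
{
    int64_t v0;
    v0 = 4 * x + x;    /* s4addq a0,a0,v0 ->    5x */
    v0 = 4 * v0 + v0;  /* s4addq v0,v0,v0 ->   25x */
    v0 = 8 * v0 + x;   /* s8addq v0,a0,v0 ->  201x */
    v0 = 8 * v0 + x;   /* s8addq v0,a0,v0 -> 1609x */
    return v0;
}

/* Chain from the Compaq C listing: 3x, 25x, 201x, 1609x. */
int64_t mul1609_ccc(int64_t x)
{
    int64_t t1;
    t1 = 4 * x - x;    /* s4subq a0,a0,t1 ->    3x */
    t1 = 8 * t1 + x;   /* s8addq t1,a0,t1 ->   25x */
    t1 = 8 * t1 + x;   /* s8addq t1,a0,t1 ->  201x */
    t1 = 8 * t1 + x;   /* s8addq t1,a0,v0 -> 1609x */
    return t1;
}
```

Both chains take four scaled operations, so in this sketch the difference
between the two listings comes down to the extension instructions around them.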


-- 

mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27468



[Bug rtl-optimization/27469] zero extension not eliminated

2009-04-18 Thread mattst88 at gmail dot com


--- Comment #2 from mattst88 at gmail dot com  2009-04-18 15:25 ---
(In reply to comment #0)
> unsigned parity(unsigned x) {
>     x ^= x >> 16;
>     x ^= x >> 8;
>     x ^= x >> 4;
>     x &= 0xf;
>     return (0x6996 >> x) & 1;
> }
> 
> gcc 4.2.0 20060506 produces:
>         extwl   a0,0x2,t2
>         lda     v0,27030
>         xor     t2,a0,t2
>         zapnot  t2,0xf,t1 # redundant zero-extension
>         srl     t1,0x8,t1
>         xor     t1,t2,t1
>         zapnot  t1,0xf,t0 # redundant zero-extension
>         srl     t0,0x4,t0
>         xor     t0,t1,t0
>         and     t0,0xf,t0
>         sra     v0,t0,v0
>         and     v0,0x1,v0
> 
> -fsee doesn't change anything here.
 

Compaq C generates:
 parity:
   0:   82 16 02 4a srl a0,0x10,t1
   4:   96 69 bf 20 lda t4,27030
   8:   02 08 02 46 xor a0,t1,t1
   c:   83 16 41 48 srl t1,0x8,t2
  10:   02 08 43 44 xor t1,t2,t1
  14:   84 96 40 48 srl t1,0x4,t3
  18:   02 08 44 44 xor t1,t3,t1
  1c:   02 f0 41 44 and t1,0xf,t1
  20:   82 07 a2 48 sra t4,t1,t1
  24:   00 30 40 44 and t1,0x1,v0
  28:   01 80 fa 6b ret
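
A small C sketch of why the zero-extensions can be dropped, under the assumption
that the 32-bit argument sits in the low half of a 64-bit register whose upper
half holds arbitrary bits. The function names are hypothetical. parity32() is
the test case itself and parity_reg64() does the same folding on the full
register without ever clearing the upper half.

```c
#include <stdint.h>

/* The test case, written with an explicit 32-bit type. */
unsigned parity32(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x &= 0xf;
    return (0x6996 >> x) & 1;   /* 0x6996 is a 16-entry parity table */
}

/* Same folding on a 64-bit register with no zero-extension in between.
   Only bits 0-31 can reach the low 4 bits kept by the final mask. */
unsigned parity_reg64(uint64_t r)
{
    r ^= r >> 16;
    r ^= r >> 8;
    r ^= r >> 4;
    r &= 0xf;
    return (unsigned)((0x6996 >> r) & 1);
}
```

In this model the upper 32 bits never influence the result, which matches the
Compaq C output containing no zapnot at all.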


-- 

mattst88 at gmail dot com changed:

   What|Removed |Added

 CC||mattst88 at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27469