[Bug target/34571] [4.3 Regression] Segfault in alpha_expand_mov at -O3
--- Comment #3 from rask at gcc dot gnu dot org 2007-12-26 17:25 ---
Created an attachment (id=14832)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14832&action=view)
patch v1

(gdb) frame 1
#1  0x0096e248 in gen_movdi (operand0=0x2ac241e195e0, operand1=0x2ac241de1a10)
    at /n/12/rask/src/all/gcc/config/alpha/alpha.md:5704
(gdb) call debug_rtx(operand0)
(reg:DI 2 $2)
(gdb) call debug_rtx(operand1)
(const:DI (plus:DI (label_ref:DI 70)
        (const_int 24 [0x18])))

In alpha_expand_mov(), we end up calling force_const_mem() because this is not a valid symbolic operand, but alpha_cannot_force_const_mem() thinks it is, so we end up with NULL_RTX and a segfault. Additionally, there is a subtle little bug in varasm.c.

Martin, please try this patch. Also please bootstrap and test it for regressions if you have the time, because I don't have any Alpha hardware to do so myself.

--

rask at gcc dot gnu dot org changed:

        What|Removed                          |Added
  AssignedTo|unassigned at gcc dot gnu dot org|rask at gcc dot gnu dot org
      Status|UNCONFIRMED                      |ASSIGNED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34571
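A rough sketch of the failure mode described in comment #3 above, assuming a much simplified model: this is not GCC code, and every name in it (toy_rtx, toy_force_const_mem, toy_expand_mov and the two predicates) is a hypothetical stand-in. It only shows how an expander that trusts one predicate can receive a null result from a helper that consults a different predicate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTL constant; not GCC's real rtx type.  */
struct toy_rtx {
    bool valid_symbolic;   /* predicate A says: usable directly in a move */
    bool pool_forbidden;   /* predicate B says: must not go to the constant pool */
    int  pool_slot;        /* nonzero once placed in the (toy) constant pool */
};

static bool toy_valid_symbolic_operand(const struct toy_rtx *x)
{
    return x->valid_symbolic;
}

static bool toy_cannot_force_const_mem(const struct toy_rtx *x)
{
    return x->pool_forbidden;
}

/* Returns NULL when predicate B refuses, mirroring the NULL_RTX result
   mentioned in the comment.  */
static struct toy_rtx *toy_force_const_mem(struct toy_rtx *x)
{
    if (toy_cannot_force_const_mem(x))
        return NULL;
    x->pool_slot = 1;
    return x;
}

/* Returns 0 on success and -1 where an unchecked expander would have
   dereferenced the null result.  */
static int toy_expand_mov(struct toy_rtx *x)
{
    if (toy_valid_symbolic_operand(x))
        return 0;                       /* use the operand as-is */

    struct toy_rtx *mem = toy_force_const_mem(x);
    if (mem == NULL)
        return -1;                      /* the two predicates disagree */
    return mem->pool_slot == 1 ? 0 : -1;
}
```

An operand for which predicate A says "not valid" while predicate B says "forbidden" is the disagreeing case; in this model the fix amounts to making the two predicates agree on it.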
[Bug target/8835] [mcore-elf] bootstrap ICE at expr.c:2771
--- Comment #20 from rask at gcc dot gnu dot org 2007-12-21 21:56 ---
Fixed as of revision 131125 on mainline and not a 4.1/4.2 regression.

--

rask at gcc dot gnu dot org changed:

           What|Removed |Added
         Status|ASSIGNED|RESOLVED
  Known to fail|4.3.0   |
  Known to work|        |4.3.0
     Resolution|        |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8835
[Bug target/8835] [mcore-elf] bootstrap ICE at expr.c:2771
--- Comment #19 from rask at gcc dot gnu dot org 2007-12-21 21:53 ---
Subject: Bug 8835

Author: rask
Date: Fri Dec 21 21:53:23 2007
New Revision: 131125

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131125
Log:
2007-12-13  Andrew Pinski  [EMAIL PROTECTED]
            Rask Ingemann Lambertsen  [EMAIL PROTECTED]

        PR target/8835
        * config/mcore/mcore.c (mcore_function_value): Call promote_mode
        instead of PROMOTE_MODE.

testsuite/
2007-12-13  Kazu Hirata  [EMAIL PROTECTED]

        PR target/8835
        * gcc.dg/pr8835-1.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/pr8835-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mcore/mcore.c
    trunk/gcc/testsuite/ChangeLog

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8835
[Bug target/34210] ffs builtin calls undefined __ffshi2
--- Comment #3 from rask at gcc dot gnu dot org 2007-12-21 22:20 ---
You want something like this in libgcc/config/avr/t-avr to get the 16-bit versions:

# Extra 16-bit integer functions.
intfuncs16 = _absvXX2 _addvXX3 _subvXX3 _mulvXX3 _negvXX2 _ffsXX2 _clzXX2 \
             _ctzXX2 _popcountXX2
hiintfuncs16 = $(subst XX,hi,$(intfuncs16))
siintfuncs16 = $(subst XX,si,$(intfuncs16))

iter-items := $(hiintfuncs16)
iter-labels := $(siintfuncs16)
iter-sizes := $(patsubst %,2,$(siintfuncs16)) $(patsubst %,2,$(hiintfuncs16))

include $(srcdir)/empty.mk $(patsubst %,$(srcdir)/siditi-object.mk,$(iter-items))
libgcc-objects += $(patsubst %,%$(objext),$(hiintfuncs16))

ifeq ($(enable_shared),yes)
libgcc-s-objects += $(patsubst %,%_s$(objext),$(hiintfuncs16))
endif

--

rask at gcc dot gnu dot org changed:

        What|Removed|Added
   Component|c      |target

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34210
[Bug rtl-optimization/6585] Redundant store/load instruction pairs on ix86
--- Comment #18 from rask at gcc dot gnu dot org 2007-12-19 14:15 ---
Created an attachment (id=14795)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14795&action=view)
(u)mulsidi3 patch

This patch (in testing) improves the register allocation, removing the last redundant movl instructions:

mul:
        pushl   %ebx                    # 40  *pushsi2 [length = 1]
        movl    8(%esp), %ebx           # 28  *movsi_1/1 [length = 4]
        movl    16(%esp), %eax          # 30  *movsi_1/1 [length = 4]
        movl    20(%esp), %ecx          # 37  *movsi_1/1 [length = 4]
        movl    12(%esp), %edx          # 38  *movsi_1/1 [length = 4]
        imull   %ebx, %ecx              # 7  *mulsi3_1/3 [length = 3]
        imull   %eax, %edx              # 8  *mulsi3_1/3 [length = 3]
        addl    %edx, %ecx              # 9  *addsi_1/1 [length = 2]
        mull    %ebx                    # 33  *umulsidi3 [length = 2]
        popl    %ebx                    # 43  popsi1 [length = 1]
        leal    (%ecx,%edx), %edx       # 39  *lea_1 [length = 3]
        ret                             # 44  return_internal [length = 1]

--

rask at gcc dot gnu dot org changed:

        What|Removed                          |Added
  AssignedTo|unassigned at gcc dot gnu dot org|rask at gcc dot gnu dot org
      Status|RESOLVED                         |ASSIGNED
  Resolution|DUPLICATE                        |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6585
[Bug target/33474] bfin: ICE: RTL check: expected code 'set' or 'clobber', have 'parallel' in bfin_adjust_cost, at config/bfin/bfin.c:3120
--- Comment #4 from rask at gcc dot gnu dot org 2007-12-18 15:31 ---
Subject: Bug 33474

Author: rask
Date: Tue Dec 18 15:30:57 2007
New Revision: 131037

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131037
Log:
        PR target/33474
        * config/bfin/bfin.c (bfin_adjust_cost): Dig into PARALLELs to find
        the SET.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/bfin/bfin.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33474
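To illustrate what "dig into PARALLELs to find the SET" means, here is a minimal sketch on a toy pattern type. It is not the code from revision 131037; toy_pat, toy_code and toy_find_set are hypothetical names, and the search only looks one level deep, which is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified pattern codes.  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_PARALLEL };

struct toy_pat {
    enum toy_code code;
    size_t n_elts;                        /* used only for TOY_PARALLEL */
    const struct toy_pat *const *elts;    /* used only for TOY_PARALLEL */
};

/* Return PAT itself if it is a SET, the first SET inside it if it is a
   PARALLEL, and NULL otherwise.  A caller that expects a SET or CLOBBER
   and is handed a PARALLEL can use this instead of failing.  */
static const struct toy_pat *toy_find_set(const struct toy_pat *pat)
{
    if (pat->code == TOY_SET)
        return pat;
    if (pat->code == TOY_PARALLEL) {
        for (size_t i = 0; i < pat->n_elts; i++)
            if (pat->elts[i]->code == TOY_SET)
                return pat->elts[i];
    }
    return NULL;
}
```

A PARALLEL holding one SET and one CLOBBER is the shape this sketch is meant for: toy_find_set returns the inner SET rather than tripping over the outer code.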
[Bug target/33474] bfin: ICE: RTL check: expected code 'set' or 'clobber', have 'parallel' in bfin_adjust_cost, at config/bfin/bfin.c:3120
--- Comment #5 from rask at gcc dot gnu dot org 2007-12-18 15:58 ---
Fixed with revision 131037.

--

rask at gcc dot gnu dot org changed:

           What|Removed |Added
         Status|ASSIGNED|RESOLVED
  Known to fail|4.3.0   |
  Known to work|        |4.3.0
     Resolution|        |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33474
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--- Comment #7 from rask at gcc dot gnu dot org 2007-12-16 12:55 ---
It's the dataflow merge (125624) that broke it. Revision 125623 with 130333 on top is fine, but 125624 with 125851 (so it builds) and 130333 on top fails.

The patch in comment #6 has one testsuite regression on arm-unknown-elf:

FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (internal compiler error)
FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (test for excess errors)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--- Comment #9 from rask at gcc dot gnu dot org 2007-12-16 13:39 ---
> That is not a regression Just an already existing failure.

Unmodified trunk revision 130944 doesn't have it:

diff -u build/arm-unknown-elf-results-unpatched/summary build/arm-unknown-elf-results-patched/summary
--- build/arm-unknown-elf-results-unpatched/summary 2007-12-15 02:26:10.0 +0100
+++ build/arm-unknown-elf-results-patched/summary 2007-12-16 04:17:59.0 +0100
@@ -17,6 +17,9 @@
 FAIL: gcc.dg/memcpy-1.c scan-tree-dump-times optimized nasty_local 0
 FAIL: gcc.dg/pr30957-1.c scan-rtl-dump loop2_unroll Expanding Accumulator
 FAIL: gcc.dg/var-expand1.c scan-rtl-dump loop2_unroll Expanding Accumulator
+FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (internal compiler error)
+FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (test for excess errors)
+UNRESOLVED: gcc.dg/struct/wo_prof_malloc_size_var.c compilation failed to produce executable
 FAIL: gcc.dg/struct/wo_prof_single_str_global.c execution test
 FAIL: gcc.dg/struct/wo_prof_single_str_local.c execution test
 FAIL: gcc.dg/struct/wo_prof_single_str_pointer.c execution test

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--- Comment #10 from rask at gcc dot gnu dot org 2007-12-16 15:18 ---
> That is not a regression Just an already existing failure.

OK, looks like an intermittent failure which happens about one out of four times.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug target/34452] Multiply-by-constant pessimation
--- Comment #3 from rask at gcc dot gnu dot org 2007-12-13 22:01 ---
In reply to comment #2 from Torbjorn Granlund:
> Another fix would perhaps be to teach synth_mult to understand that it's
> generating code for a 2.5 operand machine (one that can only do a x= b, not
> a = b x c, for some operation x). We should teach it that there will be moves
> inserted for sequences that rely on a source register twice (more or less).

The root cause of the problem is that ix86_rtx_cost() doesn't get an insn (or at least a SET expression) to look at and therefore can't check the operands and adjust the cost estimate accordingly.

--

rask at gcc dot gnu dot org changed:

        What|Removed|Added
    Keywords|       |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34452
[Bug target/34436] Illegal assembly on ARM/Thumb
--- Comment #5 from rask at gcc dot gnu dot org 2007-12-12 00:31 --- See also bug 18560. I.e. consider using a C version instead. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34436
[Bug target/34412] ICE in extract_insn, at recog.c:1990
--- Comment #3 from rask at gcc dot gnu dot org 2007-12-10 20:42 --- Broken prologue expander, notice the mode mismatch: (plus:QI (reg/f:HI -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34412
[Bug target/8835] [mcore-elf] bootstrap ICE at expr.c:2771
--- Comment #18 from rask at gcc dot gnu dot org 2007-12-10 21:03 ---
Mainline revision 130699:

/n/11/rask/build/gcc-mcore-unknown-elf/mcore-unknown-elf/libstdc++-v3/include/iomanip: In function 'std::_Setfill<_CharT> std::setfill(_CharT) [with _CharT = char]':
/n/11/rask/build/gcc-mcore-unknown-elf/mcore-unknown-elf/libstdc++-v3/include/iomanip:175: internal compiler error: in emit_move_insn, at expr.c:3366

--

rask at gcc dot gnu dot org changed:

                What|Removed                          |Added
          AssignedTo|unassigned at gcc dot gnu dot org|rask at gcc dot gnu dot org
              Status|NEW                              |ASSIGNED
       Known to fail|                                 |4.3.0
Last reconfirmed date|2005-07-01 04:36:32              |2007-12-10 21:03:27

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8835
[Bug middle-end/34349] Internal Compiler Error (fatal)
--- Comment #1 from rask at gcc dot gnu dot org 2007-12-05 13:50 ---
*** This bug has been marked as a duplicate of 33777 ***

--

rask at gcc dot gnu dot org changed:

        What|Removed    |Added
      Status|UNCONFIRMED|RESOLVED
   Component|c          |middle-end
  Resolution|           |DUPLICATE

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34349
[Bug middle-end/33777] Crash during a build of zsh
--- Comment #5 from rask at gcc dot gnu dot org 2007-12-05 13:50 ---
*** Bug 34349 has been marked as a duplicate of this bug. ***

--

rask at gcc dot gnu dot org changed:

        What|Removed|Added
          CC|       |henman at tech dot email dot ne dot jp

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33777
[Bug target/34350] [bfin]: ICE: in legitimize_pic_address, at config/bfin/bfin.c:325
--- Comment #2 from rask at gcc dot gnu dot org 2007-12-05 17:00 ---
And the configure arguments:

--target bfin-unknown-elf --enable-checking=yes,rtl --with-newlib --enable-sim --disable-gdb --disable-nls --disable-libffi --disable-target-libffi --disable-boehm-gc --disable-target-boehm-gc --without-x

--

rask at gcc dot gnu dot org changed:

           What|Removed|Added
  Known to fail|       |4.3.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34350
[Bug target/34350] New: [bfin]: ICE: in legitimize_pic_address, at config/bfin/bfin.c:325
Revision 130561 fails to build libstdc++:

gcc/xgcc -Bgcc/ -S -o /dev/null -O2 -msep-data /tmp/complex_io.cc
/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/msep-data/libstdc++-v3/include/complex: In function 'std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, const std::complex<_Tp>&) [with _Tp = double, _CharT = char, _Traits = std::char_traits<char>]':
/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/msep-data/libstdc++-v3/include/complex:528: internal compiler error: in legitimize_pic_address, at config/bfin/bfin.c:325

(gdb) bt
#0  internal_error (gmsgid=0xc38bc7 "in %s, at %s:%d") at /n/12/rask/src/all/gcc/diagnostic.c:600
#1  0x00647fac in fancy_abort (file=<value optimized out>, line=325, function=0xcab0a0 "legitimize_pic_address") at /n/12/rask/src/all/gcc/diagnostic.c:660
#2  0x00a4a068 in legitimize_pic_address (orig=<value optimized out>, reg=0x0, picreg=0x2ae99d869d40) at /n/12/rask/src/all/gcc/config/bfin/bfin.c:325
#3  0x00a4a369 in emit_pic_move (operands=0x7fff0d80e280, mode=<value optimized out>) at /n/12/rask/src/all/gcc/config/bfin/bfin.c:1964
#4  0x00a4a55f in expand_move (operands=0x7fff0d80e280, mode=SImode) at /n/12/rask/src/all/gcc/config/bfin/bfin.c:1979
#5  0x00a8d2d8 in gen_movsi (operand0=0x2ae99e9ddde0, operand1=0x2ae99e9ad840) at /n/12/rask/src/all/gcc/config/bfin/bfin.md:706
#6  0x006b6756 in emit_move_insn_1 (x=0x2ae99e9ddde0, y=0x2ae99e9ad840) at /n/12/rask/src/all/gcc/expr.c:3179
#7  0x0079c714 in gen_move_insn (x=0x2ae99e9ddde0, y=0x2ae99e9ad840) at /n/12/rask/src/all/gcc/optabs.c:4996
#8  0x00809590 in gen_reload (out=0x2ae99e9ddde0, in=0x2ae99e9ad840, opnum=1, type=RELOAD_FOR_INPUT) at /n/12/rask/src/all/gcc/reload1.c:8048
#9  0x0080b03f in do_input_reload (chain=<value optimized out>, rl=0xe9e028, j=1) at /n/12/rask/src/all/gcc/reload1.c:6995
#10 0x0080c546 in emit_reload_insns (chain=0x103c208) at /n/12/rask/src/all/gcc/reload1.c:7421
(gdb) frame 3
#3  0x00a4a369 in emit_pic_move (operands=0x7fff0d80e280, mode=<value optimized out>) at /n/12/rask/src/all/gcc/config/bfin/bfin.c:1964
1964      operands[1] = legitimize_pic_address (operands[1], temp,
(gdb) call debug_rtx(operands[0])
(reg:SI 2 R2)
(gdb) call debug_rtx(operands[1])
(const:SI (plus:SI (symbol_ref:SI ("_ZTVSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE") [flags 0x40] <var_decl 0x2ae99e5b1000 _ZTVSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE>)
        (const_int 12 [0xc])))
(gdb) frame 10
#10 0x0080c546 in emit_reload_insns (chain=0x103c208) at /n/12/rask/src/all/gcc/reload1.c:7421
7421          do_input_reload (chain, rld + j, j);
(gdb) call debug_rtx (chain->insn)
(insn 46 45 52 2 /home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/msep-data/libstdc++-v3/include/sstream:413 (set (mem/s/f/c:SI (plus:SI (reg/f:SI 15 FP)
                (const_int -208 [0xff30])) [4 __s.D.18269._vptr.basic_ostream+0 S4 A32])
        (const:SI (plus:SI (symbol_ref:SI ("_ZTVSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE") [flags 0x40] <var_decl 0x2ae99e5b1000 _ZTVSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE>)
                (const_int 12 [0xc])))) 14 {*movsi_insn} (nil))
(gdb) call debug_reload()
Reload 0: reload_out (SI) = (mem/s/f/c:SI (plus:SI (reg/f:SI 15 FP)
                (const_int -208 [0xff30])) [4 __s.D.18269._vptr.basic_ostream+0 S4 A32])
        DPREGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
        reload_out_reg: (mem/s/f/c:SI (plus:SI (reg/f:SI 15 FP)
                (const_int -208 [0xff30])) [4 __s.D.18269._vptr.basic_ostream+0 S4 A32])
Reload 1: reload_in (SI) = (const:SI (plus:SI (symbol_ref:SI ("_ZTVSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE") [flags 0x40] <var_decl 0x2ae99e5b1000 _ZTVSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE>)
                (const_int 12 [0xc])))
        DPREGS, RELOAD_FOR_INPUT (opnum = 1), can't combine
        reload_in_reg: (reg/f:SI 242)
        reload_reg_rtx: (reg:SI 2 R2)

--
           Summary: [bfin]: ICE: in legitimize_pic_address, at config/bfin/bfin.c:325
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code, build
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rask at gcc dot gnu dot org
GCC target triplet: bfin-unknown-elf

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34350
[Bug target/34350] [bfin]: ICE: in legitimize_pic_address, at config/bfin/bfin.c:325
--- Comment #1 from rask at gcc dot gnu dot org 2007-12-05 16:52 ---
Created an attachment (id=14701)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14701&action=view)
testcase

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34350
[Bug target/33474] bfin: ICE: RTL check: expected code 'set' or 'clobber', have 'parallel' in bfin_adjust_cost, at config/bfin/bfin.c:3120
--- Comment #3 from rask at gcc dot gnu dot org 2007-12-03 17:34 ---
Created an attachment (id=14695)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14695&action=view)
testcase

This testcase fails with revision 130561.

./xgcc -B./ -O2 /tmp/hashtab.c -S -o /dev/null

(gdb) call debug_rtx (dep_insn)
(parallel [
        (set (reg:PDI 33 A1)
            (mem/c:PDI (plus:SI (reg/f:SI 15 FP)
                    (const_int -16 [0xfff0])) [23 S8 A32]))
        (clobber (reg:SI 0 R0))
    ])
(gdb) call debug_rtx (pat)
Variable "pat" is not available.

--

rask at gcc dot gnu dot org changed:

        What|Removed                          |Added
  AssignedTo|unassigned at gcc dot gnu dot org|rask at gcc dot gnu dot org
      Status|UNCONFIRMED                      |ASSIGNED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33474
[Bug rtl-optimization/34312] [4.3 regression] spill failure with -O2 -fPIC -march=pentium-m on i386
--- Comment #7 from rask at gcc dot gnu dot org 2007-12-03 17:52 ---
Note that reload asks for Q_REGS when the constraints allow GENERAL_REGS, so the root cause is just reload being stupid. I bet it is this optimization from find_reloads() that causes it:

              if (! win && ! did_match
                  && this_alternative[i] != (int) NO_REGS
                  && GET_MODE_SIZE (operand_mode[i]) <= UNITS_PER_WORD
                  && reg_class_size [(int) preferred_class[i]] > 0
                  && ! SMALL_REGISTER_CLASS_P (preferred_class[i]))
                {
                  if (! reg_class_subset_p (this_alternative[i],
                                            preferred_class[i]))
                    {
                      /* Since we don't have a way of forming the intersection,
                         we just do something special if the preferred class
                         is a subset of the class we have; that's the most
                         common case anyway.  */
                      if (reg_class_subset_p (preferred_class[i],
                                              this_alternative[i]))
                        this_alternative[i] = (int) preferred_class[i];
                      else
                        reject += (2 + 2 * pref_or_nothing[i]);
                    }
                }

Additionally, i386.h CLASS_LIKELY_SPILLED_P should probably take -mregparm and -fPIC etc. into account in deciding if Q_REGS is likely spilled or not.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34312
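The narrowing step quoted in comment #7 above can be modelled with register classes as bitmasks. This is only a sketch under that assumption: toy_class, the two class constants and toy_narrow are hypothetical and do not correspond to GCC's real register class tables.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_class;   /* one bit per hard register */

/* Hypothetical classes: four byte-addressable registers inside a set of
   eight general registers.  */
#define TOY_Q_REGS        ((toy_class) 0x0f)
#define TOY_GENERAL_REGS  ((toy_class) 0xff)

/* True if every register in A is also in B.  */
static bool toy_subset_p(toy_class a, toy_class b)
{
    return (a & ~b) == 0;
}

/* Same shape as the quoted logic: if the preferred class is a proper
   subset of the class the constraint allows, narrow to the preferred
   class; if neither contains the other, only penalize the alternative.  */
static toy_class toy_narrow(toy_class alt, toy_class pref,
                            int *reject, int pref_or_nothing)
{
    if (!toy_subset_p(alt, pref)) {
        if (toy_subset_p(pref, alt))
            alt = pref;
        else
            *reject += 2 + 2 * pref_or_nothing;
    }
    return alt;
}
```

With a constraint that allows TOY_GENERAL_REGS and a preferred class of TOY_Q_REGS, toy_narrow returns TOY_Q_REGS, which is the "asks for Q_REGS when the constraints allow GENERAL_REGS" behaviour in miniature.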
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--- Comment #6 from rask at gcc dot gnu dot org 2007-12-01 22:39 ---
Created an attachment (id=14678)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14678&action=view)
patch v2

The first patch caused build failure on sh and arm, this one fixes that.

--

rask at gcc dot gnu dot org changed:

                      What|Removed|Added
Attachment #14643 is obsolete|0      |1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug rtl-optimization/3507] appalling optimisation with sub/cmp on multiple targets
--- Comment #9 from rask at gcc dot gnu dot org 2007-11-28 18:01 ---
Created an attachment (id=14657)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14657&action=view)
Patch v2 to enhance cse.c

This patch also handles unsigned comparisons and thus optimizes the original testcase too. Before:

foo:
        movl    4(%esp), %edx   # 2  *movsi_1/1 [length = 4]
        movl    8(%esp), %eax   # 3  *movsi_1/1 [length = 4]
        movl    %edx, %ecx      # 35  *movsi_1/1 [length = 2]
        subl    %eax, %ecx      # 7  *subsi_1/1 [length = 2]
        cmpl    %eax, %edx      # 8  *cmpsi_1_insn/1 [length = 2]
        jae     .L2             # 9  *jcc_1 [length = 2]
        addl    $100, %ecx      # 11  *addsi_1/1 [length = 3]
.L2:
        movl    %ecx, %eax      # 18  *movsi_1/1 [length = 2]
        ret                     # 38  return_internal [length = 1]

After:

foo:
        movl    4(%esp), %eax   # 2  *movsi_1/1 [length = 4]
        subl    8(%esp), %eax   # 8  *subsi3_cc_overflow/2 [length = 4]
        jae     .L2             # 9  *jcc_1 [length = 2]
        addl    $100, %eax      # 11  *addsi_1/1 [length = 3]
.L2:
        rep                     # 39  return_internal_long [length = 1]
        ret

I was going to abandon this patch, but maybe it deserves a second chance. :-)

--

rask at gcc dot gnu dot org changed:

                      What|Removed|Added
Attachment #14647 is obsolete|0      |1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
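Why the separate cmpl can go away in the unsigned case: an unsigned subtraction already tells you whether a < b through its borrow. A small sketch of that identity follows; toy_sub_borrow and toy_foo are hypothetical names, and toy_foo is only modelled on the assembly above, not copied from the attached testcase.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return a - b modulo 2^32 and report whether the subtraction borrowed.
   For unsigned operands the result wrapped exactly when it ends up
   larger than a, which happens exactly when a < b.  */
static uint32_t toy_sub_borrow(uint32_t a, uint32_t b, bool *borrow)
{
    uint32_t diff = a - b;
    *borrow = diff > a;
    return diff;
}

/* Shape of the function in the listing: compute a - b, and add 100 when
   a is below b.  The borrow replaces the separate comparison.  */
static uint32_t toy_foo(uint32_t a, uint32_t b)
{
    bool borrow;
    uint32_t c = toy_sub_borrow(a, b, &borrow);
    if (borrow)
        c += 100;
    return c;
}
```

In the "After" listing the jae directly after subl plays the role of the borrow test here.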
[Bug target/11259] [avr] gcc Double 'andi' missed optimization
--- Comment #8 from rask at gcc dot gnu dot org 2007-11-28 22:07 ---
Created an attachment (id=14658)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14658&action=view)
example patch

This patch is an example of the suggestion in comment #6. When compiling with -S -dp, it is clear why the code isn't optimized: swap+andi is one instruction.

test:
/* prologue: function */
/* frame size = 0 */
        in r24,50-0x20          ; 6  *movqi/4 [length = 1]
        swap r24                ; 7  lshrqi3/5 [length = 2]
        andi r24,0x0f
        andi r24,lo8(12)        ; 13  andqi3/2 [length = 1]
/* epilogue start */
        ret                     ; 26  return [length = 1]

The patch makes two instructions out of swap+andi of which andi is optimized away:

test:
/* prologue: function */
/* frame size = 0 */
        in r24,50-0x20          ; 6  *movqi/4 [length = 1]
        swap r24                ; 7  _rotlqi3_const4 [length = 1]
        andi r24,lo8(12)        ; 14  andqi3/2 [length = 1]
/* epilogue start */
        ret                     ; 27  return [length = 1]

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11259
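The reason the first andi is redundant can be checked in plain C. The sketch below models the 8-bit nibble swap with a hypothetical helper toy_swap and compares the two instruction sequences from the listings above; it says nothing about how the patch itself is written.

```c
#include <stdint.h>

/* Model of the AVR "swap" instruction: exchange the two nibbles.  */
static uint8_t toy_swap(uint8_t x)
{
    return (uint8_t) ((x << 4) | (x >> 4));
}

/* Before: the shift right by 4 is emitted as swap + andi 0x0f, and the
   source-level mask with 12 follows as a second andi.  */
static uint8_t toy_before(uint8_t x)
{
    return (uint8_t) ((toy_swap(x) & 0x0f) & 12);
}

/* After: once swap and andi are separate insns, the two masks can be
   merged, and 0x0f & 12 is just 12.  */
static uint8_t toy_after(uint8_t x)
{
    return (uint8_t) (toy_swap(x) & 12);
}
```

Both functions agree for all 256 inputs, and both equal (x >> 4) & 12.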
[Bug rtl-optimization/3507] appalling optimisation with sub/cmp on multiple targets
--- Comment #11 from rask at gcc dot gnu dot org 2007-11-29 00:09 ---
In reply to comment #10 from Steven Bosscher 2007-11-28 22:02:

> +  for (defs = DF_INSN_DEFS (insn);
> +       *defs && DF_REF_REGNO (*defs) != REGNO (x);
> +       defs++)
> +    ;
>
> Are you aware of df_find_def() ?

Not until now, and it won't work either, because it uses rtx_equal_p() and the mode of DF_REF_REG() doesn't match that of the REG rtx in the insn. See also <URL:http://gcc.gnu.org/ml/gcc/2007-11/msg00719.html>.

> IMNSHO, computing DEF-USE chains for this niche optimization loses in the
> cost/benefit trade-off.

The split of the comparison setter and the comparison user across two insns with no link between them is an interesting case of poor infrastructure. Most back ends can't emit the setter before they know what the user looks like and therefore always emit the two back-to-back. At the same time, several passes need to find one from the other but can't rely on them to be back-to-back and because there's no link between them (except if by DEF-USE/USE-DEF), they have to roll their own means of doing so. (So yes, it's been done without DEF-USE before and it can be done without DEF-USE again.)

> I wonder if you can't just integrate this optimization in cse.c as-is by
> recording an equivalence a < b == signof(c) when you process a - b.

That doesn't catch the unsigned comparison. Actually, how would I even know if it is unsigned or not?

> In any case, adding this optimization to cse.c is Just Wrong (tm). I don't
> understand why you didn't even try to optimize this in one of the tree
> optimizers instead. I thought it was clear GCC is trying to reduce its
> dependency on RTL optimizations?

There's no way to optimize the signed case with -fwrapv (e.g. Java) at the tree level, because we can't arrange for a - b to be computed in the same instruction as a < b. At the RTL level, it is merely difficult. That makes it less interesting to work on this optimization at the tree level.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
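A small illustration of the signed point made at the end of comment #11 above: with wrapping arithmetic, the sign of a - b alone does not decide a < b, but the sign and overflow flags of the same subtraction together do (this is the condition an x86 jl tests). The flag model below is a sketch with hypothetical names (toy_flags, toy_sub_flags and friends), not code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_flags {
    bool sf;   /* sign of the wrapped result */
    bool of;   /* signed overflow occurred */
};

/* Compute a - b with wraparound (in unsigned arithmetic, so the C code
   itself has no undefined behaviour) and derive the two flags.  */
static uint32_t toy_sub_flags(int32_t a, int32_t b, struct toy_flags *f)
{
    uint32_t ua = (uint32_t) a, ub = (uint32_t) b;
    uint32_t d = ua - ub;
    f->sf = (d >> 31) != 0;
    /* Overflow: operands differ in sign and the result's sign differs
       from the first operand's.  */
    f->of = (((ua ^ ub) & (ua ^ d)) >> 31) != 0;
    return d;
}

/* a < b recovered from the flags of a - b.  */
static bool toy_less_from_flags(int32_t a, int32_t b)
{
    struct toy_flags f;
    (void) toy_sub_flags(a, b, &f);
    return f.sf != f.of;
}

/* Looking at the sign of the wrapped difference only: wrong whenever
   the subtraction overflows.  */
static bool toy_less_sign_only(int32_t a, int32_t b)
{
    struct toy_flags f;
    (void) toy_sub_flags(a, b, &f);
    return f.sf;
}
```

For a = INT32_MIN and b = 1 the wrapped difference is positive, so the sign-only test answers "not less" while the flag-based test correctly answers "less".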
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--

rask at gcc dot gnu dot org changed:

                What|Removed                          |Added
          AssignedTo|unassigned at gcc dot gnu dot org|rask at gcc dot gnu dot org
              Status|NEW                              |ASSIGNED
       Known to fail|                                 |4.3.0
       Known to work|                                 |4.2.2
Last reconfirmed date|2007-11-26 17:31:08              |2007-11-27 13:38:08

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug rtl-optimization/3507] appalling optimisation with sub/cmp on multiple targets
--- Comment #8 from rask at gcc dot gnu dot org 2007-11-27 18:04 ---
Created an attachment (id=14647)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14647&action=view)
Patch to enhance cse.c

The attached patch can optimize this testcase:

void foo (int);

int bar2 (int a, int b)
{
  int c = a - b;
  if (a < b)
    {
      foo (c);
    }
  return c;
}

Before:

bar2:
        pushl   %ebx            # 33  *pushsi2 [length = 1]
        subl    $8, %esp        # 34  pro_epilogue_adjust_stack_1/1 [length = 3]
        movl    16(%esp), %edx  # 2  *movsi_1/1 [length = 4]
        movl    20(%esp), %eax  # 3  *movsi_1/1 [length = 4]
        movl    %edx, %ebx      # 32  *movsi_1/1 [length = 2]
        subl    %eax, %ebx      # 7  *subsi_1/1 [length = 2]
        cmpl    %eax, %edx      # 8  *cmpsi_1_insn/1 [length = 2]
        jge     .L2             # 9  *jcc_1 [length = 2]
        movl    %ebx, (%esp)    # 11  *movsi_1/2 [length = 3]
        call    foo             # 12  *call_0 [length = 5]
.L2:
        movl    %ebx, %eax      # 19  *movsi_1/1 [length = 2]
        addl    $8, %esp        # 37  pro_epilogue_adjust_stack_1/1 [length = 3]
        popl    %ebx            # 38  popsi1 [length = 1]
        ret                     # 39  return_internal [length = 1]

After:

bar2:
        pushl   %ebx            # 33  *pushsi2 [length = 1]
        subl    $8, %esp        # 34  pro_epilogue_adjust_stack_1/1 [length = 3]
        movl    16(%esp), %eax  # 2  *movsi_1/1 [length = 4]
        movl    %eax, %ebx      # 32  *movsi_1/1 [length = 2]
        subl    20(%esp), %ebx  # 8  *subsi_2/2 [length = 4]
        jns     .L2             # 9  *jcc_1 [length = 2]
        movl    %ebx, (%esp)    # 11  *movsi_1/2 [length = 3]
        call    foo             # 12  *call_0 [length = 5]
.L2:
        movl    %ebx, %eax      # 19  *movsi_1/1 [length = 2]
        addl    $8, %esp        # 37  pro_epilogue_adjust_stack_1/1 [length = 3]
        popl    %ebx            # 38  popsi1 [length = 1]
        ret                     # 39  return_internal [length = 1]

One of the difficulties is that by the time the code gets to the cse1 pass, the statement c = a - b might have been moved below the a < b test, making it harder to optimize. The bar2 testcase was crafted so this doesn't happen. But given the need to potentially replace both the sub/cmp insn and the jump insn, it would be better to move this optimization to combine.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #14 from rask at gcc dot gnu dot org 2007-11-28 01:44 ---
Subject: Bug 34174

Author: rask
Date: Wed Nov 28 01:44:10 2007
New Revision: 130489

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=130489
Log:
        Backport from mainline:
        2007-11-26  Rask Ingemann Lambertsen  [EMAIL PROTECTED]

        PR target/34174
        * config/fr30/fr30.c (fr30_move_double): Sanitize mem-reg case. Copy
        the address before it is clobbered.

testsuite/
        * gcc.dg/pr34174-1.c: New.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/pr34174-1.c
      - copied, changed from r130438, trunk/gcc/testsuite/gcc.dg/torture/pr34174-1.c
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/config/fr30/fr30.c
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #15 from rask at gcc dot gnu dot org 2007-11-28 01:55 ---
Fixed for both 4.2.3 and 4.3.0.

--

rask at gcc dot gnu dot org changed:

           What|Removed    |Added
         Status|ASSIGNED   |RESOLVED
  Known to fail|4.2.2 4.3.0|4.2.2
  Known to work|           |4.2.3 4.3.0
     Resolution|           |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--- Comment #3 from rask at gcc dot gnu dot org 2007-11-26 17:27 ---
This seems to have started with revision 130333, but I don't think that change is to blame:

(gdb) bt
#0  frv_secondary_reload_class (class=ICC_REGS, mode=BImode, x=0x2b0f0ec48d80, in_p=0) at /n/12/rask/src/all/gcc/config/frv/frv.c:6347
#1  0x0073430a in default_secondary_reload (in_p=0 '\0', x=0x2b0f0ec48d80, reload_class=GPR_REGS, reload_mode=BImode, sri=0x7fff9c430880) at /n/12/rask/src/all/gcc/targhooks.c:595
#2  0x006bca5c in secondary_reload_class (in_p=1 '\001', class=GPR_REGS, mode=VOIDmode, x=0x7) at /n/12/rask/src/all/gcc/reload.c:525
#3  0x006aa82f in init_regs () at /n/12/rask/src/all/gcc/regclass.c:1285
#4  0x00737032 in backend_init_target () at /n/12/rask/src/all/gcc/toplev.c:2040
#5  0x0073751a in toplev_main (argc=<value optimized out>, argv=<value optimized out>) at /n/12/rask/src/all/gcc/toplev.c:2086
#6  0x2b0f0ea1d4ca in __libc_start_main () from /lib/libc.so.6
#7  0x00403c2a in _start () at ../sysdeps/x86_64/elf/start.S:113
(gdb) fin
Run till exit from #0  frv_secondary_reload_class (class=ICC_REGS, mode=BImode, x=0x2b0f0ec48d80, in_p=0) at /n/12/rask/src/all/gcc/config/frv/frv.c:6347
default_secondary_reload (in_p=0 '\0', x=0x2b0f0ec48d80, reload_class=GPR_REGS, reload_mode=BImode, sri=0x7fff9c430880) at /n/12/rask/src/all/gcc/targhooks.c:597
Value returned is $16 = GPR_REGS

The next few lines of code read:

  if (class != NO_REGS)
    {
      enum insn_code icode = (in_p ? reload_in_optab[(int) reload_mode]
                              : reload_out_optab[(int) reload_mode]);

(gdb) print icode
$19 = 0
(gdb) print insn_data[0]->name
$20 = 0xb8e5ba "*movqi_load"
(gdb) print insn_data[0]->n_operands
$21 = 2 '\002'

I.e. we get the wrong icode.

(gdb) print reload_in_optab
$22 = {0 <repeats 45 times>}
(gdb) print reload_out
$23 = {0 <repeats 45 times>}

I would have expected a default of CODE_FOR_nothing, not 0. I think this is a problem with the new lazy optab initialization. We now call backend_init_target() and init_regs() before calling init_optabs().

--

rask at gcc dot gnu dot org changed:

        What|Removed|Added
   Component|target |middle-end

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
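The ordering problem described in comment #3 above can be reproduced in miniature. In the sketch below the enum values, the table and the function names are all hypothetical; the only point is that a statically allocated table reads as all zeros until it is explicitly filled with the "nothing" sentinel, and that zero can be a perfectly valid code.

```c
#include <stddef.h>

/* Hypothetical insn codes: code 0 is a real pattern, and the sentinel
   meaning "no such pattern" is a nonzero value.  */
enum toy_insn_code {
    TOY_CODE_FOR_movqi_load = 0,
    TOY_CODE_FOR_movsi      = 1,
    TOY_CODE_FOR_nothing    = 2
};

#define TOY_NUM_MODES 4

/* Static storage is zero-initialized, so every slot reads as
   TOY_CODE_FOR_movqi_load until toy_init_optabs() has run.  */
static enum toy_insn_code toy_reload_in_optab[TOY_NUM_MODES];

/* Fill the table with the sentinel; a real initializer would then
   record the patterns that actually exist.  */
static void toy_init_optabs(void)
{
    for (size_t i = 0; i < TOY_NUM_MODES; i++)
        toy_reload_in_optab[i] = TOY_CODE_FOR_nothing;
}

/* Query used by an early consumer of the table.  */
static int toy_has_reload_pattern(int mode)
{
    return toy_reload_in_optab[mode] != TOY_CODE_FOR_nothing;
}
```

Calling toy_has_reload_pattern() before toy_init_optabs() reports a pattern for every mode, which corresponds to getting icode 0 where a "nothing" default was expected.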
[Bug middle-end/34226] [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
--- Comment #4 from rask at gcc dot gnu dot org 2007-11-26 18:14 ---
Created an attachment (id=14643)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14643&action=view)
patch v1

This patch makes the ICE go away.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #13 from rask at gcc dot gnu dot org 2007-11-26 13:20 ---
Subject: Bug 34174

Author: rask
Date: Mon Nov 26 13:20:19 2007
New Revision: 130438

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=130438
Log:
        PR target/34174
        * config/fr30/fr30.c (fr30_move_double): Sanitize mem-reg case. Copy
        the address before it is clobbered.

testsuite/
        * gcc.dg/torture/pr34174-1.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr34174-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/fr30/fr30.c
    trunk/gcc/testsuite/ChangeLog

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug target/34226] New: [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
$ echo 'int main (int argc, char *argv[]) { return 0; }' > /tmp/test.c
$ ./xgcc -B./ /tmp/test.c -S -o /dev/null
/tmp/test.c:1: internal compiler error: in default_secondary_reload, at targhooks.c:612
Please submit a full bug report, ...

This happens with revision 130402. Revision 129967 worked.

Configure flags:
--target frv-unknown-elf --enable-checking=yes,rtl --with-newlib --enable-sim --disable-gdb --disable-nls

--
           Summary: [4.3 Regression][frv] ICE in default_secondary_reload, at targhooks.c:612
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code, build
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rask at gcc dot gnu dot org
GCC target triplet: frv-unknown-elf

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34226
[Bug rtl-optimization/6585] Redundant store/load instruction pairs on ix86
--- Comment #16 from rask at gcc dot gnu dot org 2007-11-23 20:43 ---
(In reply to comment #
> I think the bug can be closed as fixed now.

I'm not so convinced. This part

        leal    (%ecx,%edx), %esi
        movl    %esi, %edx
        movl    4(%esp), %esi

should have been

        addl    %ecx, %edx
        movl    4(%esp), %esi

and using %esi instead of a stack slot doesn't fix the problem. Compare with what we started with (comment #0):

        movl    %edx, 4(%esp)   hi+a0*b1        ! Could be simplified
        addl    %ecx, 4(%esp)   hi+a0*b1+a1*b0  ! to a single insn:
        movl    4(%esp), %edx                   ! addl %ecx, $edx

The problem hasn't been fixed until we get that addl %ecx, %edx instruction.

One more example:

        movl    16(%esp), %eax  ->      movl    16(%esp), %esi
        movl    20(%esp), %esi  ->      movl    20(%esp), %eax
        imull   %ebx, %ecx
        imull   %esi, %eax      ->      imull   %eax, %esi
        addl    %eax, %ecx      ->      addl    %esi, %ecx
        movl    %esi, %eax      ->      -
        mull    %ebx

This is exactly what a register allocator is supposed to figure out: Register allocation.

--

rask at gcc dot gnu dot org changed:

        What|Removed |Added
      Status|RESOLVED|REOPENED
  Resolution|FIXED   |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6585
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #11 from rask at gcc dot gnu dot org 2007-11-23 13:46 ---
> dp-bit.c:964: internal compiler error: in change_address_1, at emit-rtl.c:1783

That is because it needs an offsetted address during and after reload, and the fr30 doesn't have that. It can happen on trunk too.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #6 from rask at gcc dot gnu dot org 2007-11-22 12:31 ---
Created an attachment (id=14605)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14605&action=view)
patch v1 for GCC 4.2.2

Here's a different patch which hopefully doesn't ICE on GCC 4.2.2.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #8 from rask at gcc dot gnu dot org 2007-11-22 13:54 ---
Created an attachment (id=14607)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14607&action=view)
patch v2 for GCC 4.2.2

Weird. I don't understand why GCC 4.2.2 is having problems with that.

This patch tries to fix fr30_move_double(), which has been broken ever since it was added seven and a half years ago. The instruction sequence now looks like this:

        ldi:32  a, r3           ; 13  movsi_internal/4 [length = 6]
        ldi:8   #248, r1        ; 52  movsi_internal/1 [length = 2]
        extsb   r1              ; 53  extendqisi2 [length = 2]
        addn    fp, r1          ; 16  addsi_regs [length = 2]
        mov     r1, r2          ; 74  movsi_internal/5 [length = 2]
        ld      @r1, r1         ; 75  movsi_internal/7 [length = 2]
        addn    4, r2           ; 76  addsi_small_int/1 [length = 2]
        ld      @r2, r2         ; 77  movsi_internal/7 [length = 2]
        st      r1, @r3         ; 78  movsi_internal/6 [length = 2]
        mov     r3, r0          ; 79  movsi_internal/5 [length = 2]
        addn    4, r0           ; 80  addsi_small_int/1 [length = 2]
        st      r2, @r0         ; 81  movsi_internal/6 [length = 2]

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #10 from rask at gcc dot gnu dot org 2007-11-23 01:33 --- I think both branches of if (reverse) could use the exact same code, i.e. this whole reverse/!reverse idea is bogus on fr30. Suppose our output registers are r1 and r2 and we receive the address in rN. Then, for any N, this instruction sequence should work: ... mov rN, r2 ld @rN, r1 addn 4, r2 ld @r2, r2 ... I found an fr30 simulator in GDB 5.2, so I'm testing the patch for GCC 4.3. Because this bug is not a regression, it might not be fixed in GCC 4.2.x. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
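The claim above, that one sequence works for any N, can be checked with a small C model. This is only a sketch under stated assumptions: the `struct machine` type, the 16-register file and the helper names are made up for illustration and are not taken from the fr30 port or from any simulator. It executes the four instructions for every possible address register, including the overlapping cases N = 1 and N = 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy register/memory model; an illustration only, not an fr30 simulator.  */
enum { NREGS = 16, MEM_WORDS = 16 };

struct machine {
  uint32_t reg[NREGS];
  uint32_t mem[MEM_WORDS];   /* mem[i] is the word at byte address 4 * i */
};

static uint32_t load_word (const struct machine *m, uint32_t addr)
{
  assert (addr % 4 == 0 && addr / 4 < MEM_WORDS);
  return m->mem[addr / 4];
}

/* The sequence from the comment, with the address in rN:
     mov rN, r2 ; ld @rN, r1 ; addn 4, r2 ; ld @r2, r2  */
static void load_double (struct machine *m, int n)
{
  m->reg[2] = m->reg[n];                 /* mov  rN, r2  */
  m->reg[1] = load_word (m, m->reg[n]);  /* ld   @rN, r1 */
  m->reg[2] += 4;                        /* addn 4, r2   */
  m->reg[2] = load_word (m, m->reg[2]);  /* ld   @r2, r2 */
}

/* Returns 1 if r1/r2 end up holding the two words stored at the address.  */
static int sequence_works_for (int n)
{
  struct machine m;
  memset (&m, 0, sizeof m);
  m.mem[4] = 0x11111111u;    /* word at byte address 16 */
  m.mem[5] = 0x22222222u;    /* word at byte address 20 */
  m.reg[n] = 16;             /* address arrives in rN   */
  load_double (&m, n);
  return m.reg[1] == 0x11111111u && m.reg[2] == 0x22222222u;
}

/* Checks every choice of address register, including N = 1 and N = 2.  */
static int sequence_works_for_all (void)
{
  for (int n = 0; n < NREGS; n++)
    if (!sequence_works_for (n))
      return 0;
  return 1;
}
```

The order matters: the address is copied to r2 before r1 is overwritten, so the sequence still works when rN is r1 itself.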
[Bug bootstrap/32212] Makefile:142: ../.././gcc/libgcc.mvars: No such file or directory
--- Comment #3 from rask at gcc dot gnu dot org 2007-11-21 14:44 --- Strictly speaking, it is a bug that building in the source tree doesn't work, but IIRC, the instructions on building GCC do mention that building in the source tree doesn't work, and no fix seems likely any time soon. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||WONTFIX http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32212
[Bug target/34174] gcc produces erroneous asm for movdi
--- Comment #3 from rask at gcc dot gnu dot org 2007-11-21 15:40 ---
Created an attachment (id=14592) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14592&action=view) patch v1 for GCC 4.3

Please use -dp when posting asm output because it makes it easier to see what is going on. Please also state the compiler options needed to reproduce the bug. Anyway, confirmed on trunk with revision 130319 with -O0. It is the usual problem of a target which defines movdi patterns when it shouldn't and the fix is to just delete the crap, which the attached patch does. It even saves an instruction.

Before (function foo):

    ...
    ldi:32 a, r3     ; 13 movsi_internal/4   [length = 6]
    ldi:8  #248, r1  ; 52 movsi_internal/1   [length = 2]
    extsb  r1        ; 53 extendqisi2        [length = 2]
    addn   fp, r1    ; 16 addsi_regs         [length = 2]
    ld     @r1, r1   ; 74 movsi_internal/7   [length = 2]
    mov    r1, r2    ; 75 movsi_internal/5   [length = 2]
    addn   4, r2     ; 76 addsi_small_int/1  [length = 2]
    ld     @r2, r2   ; 77 movsi_internal/7   [length = 2]
    st     r1, @r3   ; 78 movsi_internal/6   [length = 2]
    mov    r3, r0    ; 79 movsi_internal/5   [length = 2]
    addn   4, r0     ; 80 addsi_small_int/1  [length = 2]
    st     r2, @r0   ; 81 movsi_internal/6   [length = 2]
    ...

After:

    ldi:32 a, r3     ; 19 movsi_internal/4   [length = 6]
    ldi:8  #248, r1  ; 77 movsi_internal/1   [length = 2]
    extsb  r1        ; 78 extendqisi2        [length = 2]
    addn   fp, r1    ; 22 addsi_regs         [length = 2]
    ld     @r1, r2   ; 23 movsi_internal/7   [length = 2]
    st     r2, @r3   ; 24 movsi_internal/6   [length = 2]
    mov    r3, r2    ; 82 movsi_internal/5   [length = 2]
    addn   4, r2     ; 26 addsi_small_int/1  [length = 2]
    addn   4, r1     ; 28 addsi_small_int/1  [length = 2]
    ld     @r1, r1   ; 29 movsi_internal/7   [length = 2]
    st     r1, @r2   ; 30 movsi_internal/6   [length = 2]

Please try this patch with GCC 4.2.2. Also, do you happen to know of a simulator for fr30?

-- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rask at gcc dot gnu dot org |dot org | Status|UNCONFIRMED |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34174
[Bug middle-end/29749] [4.0/4.1/4.2/4.3 regression] Missing byte swap optimizations
--- Comment #14 from rask at gcc dot gnu dot org 2007-11-21 19:11 ---
> it is preferable to avoid including any headers in testcases if possible.

Yes, but IMHO not at the cost of disabling the test on targets where it is supposed to run and pass. Targets without stdint.h are rare and this sort of optimization really does need to be tested on the 8-bit and 16-bit targets too.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29749
[Bug other/31088] Building cross-compiler with newlib-1.15.0 and binutils-2.17 fails in libssp
--- Comment #1 from rask at gcc dot gnu dot org 2007-11-20 19:30 --- If you don't build in a combined tree, you'll have to build and install binutils and newlib before you can build some of the gcc components, such as libssp. This means building gcc twice; first with --disable-libssp and so on, then build and install newlib and finally build gcc again without --disable-libssp and so on. IOW using a combined tree is much easier, but see also bug 32154, which was only fixed for GCC 4.3. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Component|c |other Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31088
[Bug bootstrap/32287] gas version style changed causes warnings with configure
--- Comment #7 from rask at gcc dot gnu dot org 2007-11-20 19:46 --- *** Bug 34165 has been marked as a duplicate of this bug. *** -- rask at gcc dot gnu dot org changed: What|Removed |Added CC||warren dot l dot dodge at ||tektronix dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32287
[Bug bootstrap/34165] gcc directory fails to configure on solaris 8
--- Comment #1 from rask at gcc dot gnu dot org 2007-11-20 19:46 --- *** This bug has been marked as a duplicate of 32287 *** -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34165
[Bug middle-end/29749] [4.0/4.1/4.2/4.3 regression] Missing byte swap optimizations
--- Comment #11 from rask at gcc dot gnu dot org 2007-11-20 20:04 --- /* { dg-do run { target { int32plus } } } */ Or even better, use types such as uint_least32_t from stdint.h, then use this dg directive: /* { dg-do run { target { stdint_types } } } */ -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29749
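As an illustration of the second suggestion, a testcase body along these lines would avoid assuming that int is 32 bits wide. This is a sketch only: the function name `swap32` and the shift-and-mask idiom are assumptions chosen for the example, not the contents of the actual PR 29749 testcase.

```c
/* { dg-do run { target { stdint_types } } } */
#include <stdint.h>

/* Byte swap written with uint_least32_t so that it also compiles on
   targets where int is only 16 bits wide.  The value is masked first
   because uint_least32_t may be wider than 32 bits.  */
uint_least32_t swap32 (uint_least32_t x)
{
  x &= UINT32_C (0xffffffff);
  return ((x & 0xff) << 24)
         | ((x & 0xff00) << 8)
         | ((x >> 8) & 0xff00)
         | ((x >> 24) & 0xff);
}
```

A complete run test would add a main() that calls swap32() on a known value and calls abort() on a mismatch.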
[Bug testsuite/28870] [4.2/4.3 Regression] configuring, over-riding timeout values in testsuite
--- Comment #16 from rask at gcc dot gnu dot org 2007-11-17 12:19 --- A directive which allows a test to increase the timeout to x times the normal timeout would probably be a good idea. A few of the tests take much longer than most and IMO their timeout should be set based on the default timeout, which we assume the user sets to something appropriate for the machine used for testing. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28870
[Bug target/8603] [Alpha] s?addl pattern doesn't work
--- Comment #4 from rask at gcc dot gnu dot org 2007-11-14 19:10 ---
For f(), combine wants a pattern to match

    (set (reg:DI 76)
         (sign_extend:DI (subreg:SI (plus:DI (subreg:DI (mult:SI (reg:SI 16 $16 [ x ])
                                                                 (const_int 4 [0x4])) 0)
                                             (reg:DI 17 $17 [ y ])) 0)))

but the closest one is

    (define_insn "*saddl_se"
      [(set (match_operand:DI 0 "register_operand" "=r,r")
            (sign_extend:DI
             (plus:SI (mult:SI (match_operand:SI 1 "reg_not_elim_operand" "r,r")
                               (match_operand:SI 2 "const48_operand" "I,I"))
                      (match_operand:SI 3 "sext_add_operand" "rI,O"))))]

and similarly for g() where the *ssubl_se pattern doesn't match. I wonder where the (subreg:DI (mult:SI ...)) part comes from. That can't be right.

-- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rask at gcc dot gnu dot org |dot org | Status|NEW |ASSIGNED Known to fail||4.3.0 Last reconfirmed|2005-09-07 17:37:27 |2007-11-14 19:10:09 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8603
[Bug rtl-optimization/34072] unoptimal byte extraction.
--- Comment #1 from rask at gcc dot gnu dot org 2007-11-14 01:44 ---
With -S -dp it is clear that only byte0 is optimized:

    byte0: movzbl 4(%esp), %eax      # 11 *movqi_1/3
    byte1: movl   4(%esp), %eax      # 24 *movsi_1/1
           movl   8(%esp), %edx      # 25 *movsi_1/1
           shrdl  $8, %edx, %eax     # 30 x86_shrd_1/1
    byte6: movzwl 10(%esp), %eax     # 24 *zero_extendhisi2_movzwl
    byte7: movzbl 11(%esp), %eax     # 28 *zero_extendqisi2_movzbw

They should all be optimized to use movqi. The first part of the problem is that any of cse, cse2, gcse and fwprop will combine these instructions

    (insn 7 6 8 2 /tmp/pr34072.c:3 (set (reg:QI 60) (subreg:QI (reg:SI 64) 0)) 62 {*movqi_1} (nil))
    (insn 8 7 12 2 /tmp/pr34072.c:3 (set (reg:QI 58 [ result ]) (reg:QI 60)) 62 {*movqi_1} (nil))
    (insn 12 8 18 2 /tmp/pr34072.c:3 (set (reg/i:QI 0 ax) (reg:QI 58 [ result ])) 62 {*movqi_1} (nil))

into

    (insn 12 8 18 2 /tmp/pr34072.c:3 (set (reg/i:QI 0 ax [ result ]) (subreg:QI (reg:SI 64) 0)) 62 {*movqi_1} (nil))

and then combine won't touch it because of the hard register (ax) and SMALL_REGISTER_CLASSES and/or CLASS_LIKELY_SPILLED. The fix is to teach these passes to not combine these insns, as demonstrated using -fno-forward-propagate -fno-gcse -fno-rerun-cse-after-loop -fno-cse[1]:

    byte6: movzbl 10(%esp), %eax     # 8 *movqi_1/3
    byte7: movzbl 11(%esp), %eax     # 8 *movqi_1/3

Byte1 is still not optimized because we're failing to simplify this instruction in combine:

    (set (reg:QI 60) (subreg:QI (lshiftrt:DI (mem/c/i:DI (reg/f:SI 16 argp) [2 x+0 S8 A32]) (const_int 8 [0x8])) 0))

It should be entirely possible to simplify it to this:

    (set (reg:QI 60) (mem/c/i:QI (plus:SI (reg/f:SI 16 argp) (const_int 1))))

[1] An option I hacked in to debug this problem.

-- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |NEW Component|target |rtl-optimization Ever Confirmed|0 |1 Keywords||missed-optimization Known to fail||4.3.0 Last reconfirmed|-00-00 00:00:00 |2007-11-14 01:44:03 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34072
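For readers without the PR attachment at hand, the functions being compiled presumably look something like the following. This is a reconstruction from the asm listing and an assumption, not the reporter's source; the `byte_via_memory` and `is_little_endian` helpers are hypothetical and only there to show the shape of the simplification asked for above (a shift by 8*N followed by truncation equals a single byte load at offset N on a little-endian target such as x86).

```c
#include <stdint.h>

/* Extract byte N of a 64-bit value by shifting and truncating.  */
unsigned char byte0 (uint64_t x) { return (unsigned char) x; }
unsigned char byte1 (uint64_t x) { return (unsigned char) (x >> 8); }
unsigned char byte6 (uint64_t x) { return (unsigned char) (x >> 48); }
unsigned char byte7 (uint64_t x) { return (unsigned char) (x >> 56); }

/* What the optimized code amounts to on a little-endian target:
   one byte load from the object's address plus N.  */
unsigned char byte_via_memory (const uint64_t *p, unsigned n)
{
  return ((const unsigned char *) p)[n];
}

/* Returns 1 on little-endian hosts, 0 otherwise.  */
int is_little_endian (void)
{
  const uint16_t probe = 1;
  return *(const unsigned char *) &probe == 1;
}
```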
[Bug middle-end/34040] New: [4.3 Regression] ICE: in simplify_subreg, at simplify-rtx.c:4921 building libgfortran
Build failure in libgfortran with revision 129967 (128328 worked): /home/rask/build/gcc-sh-unknown-elf/./gcc/gfortran -B/home/rask/build/gcc-sh-unknown-elf/./gcc/ -nostdinc -B/home/rask/build/gcc-sh-unknown-elf/sh-unknown-elf/m2e/newlib/ -isystem /home/rask/build/gcc-sh-unknown-elf/sh-unknown-elf/m2e/newlib/targ-include -isystem /n/12/rask/src/all/newlib/libc/include -B/usr/local/sh-unknown-elf/bin/ -B/usr/local/sh-unknown-elf/lib/ -isystem /usr/local/sh-unknown-elf/include -isystem /usr/local/sh-unknown-elf/sys-include -L/home/rask/build/gcc-sh-unknown-elf/./ld -m2e -DHAVE_CONFIG_H -I. -I/n/12/rask/src/all/libgfortran -I. -iquote/n/12/rask/src/all/libgfortran/io -I/n/12/rask/src/all/libgfortran/../gcc -I/n/12/rask/src/all/libgfortran/../gcc/config -I../../.././gcc -D_GNU_SOURCE -I . -Wall -fno-repack-arrays -fno-underscoring -fallow-leading-underscore -g -O2 -m2e -c /n/12/rask/src/all/libgfortran/generated/_sign_r8.F90 -o _sign_r8.o /n/12/rask/src/all/libgfortran/generated/_sign_r8.F90: In function '_gfortran_specific__sign_r8': /n/12/rask/src/all/libgfortran/generated/_sign_r8.F90:46: internal compiler error: in simplify_subreg, at simplify-rtx.c:4921 (gdb) call debug_rtx (target) (reg:SF 159 [ D.489 ]) (gdb) frame 6 #6 0x0069c488 in expand_copysign (op0=0x2ae08e34bba0, op1=0x2ae08e34ba80, target=0x1) at /n/12/rask/src/all/gcc/optabs.c:3621 3621 rtx targ_piece = operand_subword (target, i, 1, mode); (gdb) call debug_rtx (op0) (mem:DF (reg/v/f:SI 161 [ p1 ]) [2 (* p1) S8 A32]) (gdb) call debug_rtx (op1) (mem:DF (reg/v/f:SI 162 [ p2 ]) [2 (* p2) S8 A32]) (gdb) print mode $4 = DFmode Notice the mismatching modes: mode = DFmode with SFmode target. 
(gdb) frame 4 #4 0x0076a6c9 in simplify_subreg (outermode=SImode, op=0x2ae08e34dfa0, innermode=DFmode, byte=0) at /n/12/rask/src/all/gcc/simplify-rtx.c:4920 4920 gcc_assert (GET_MODE (op) == innermode (gdb) call debug_rtx(op) (reg:SF 159 [ D.489 ]) Command line: ./gcc/f951 _sign_r8.f95 -ffree-form -quiet -dumpbase _sign_r8.F90 -m2e -m2e -auxbase-strip _sign_r8.o -g -O2 -Wall -fno-repack-arrays -fno-underscoring -fallow-leading-underscore -I. -I/n/12/rask/src/all/libgfortran -I. -I/n/12/rask/src/all/libgfortran/../gcc -I/n/12/rask/src/all/libgfortran/../gcc/config -I../../.././gcc -I . -fpreprocessed -o /dev/null Configure flags: --target sh-unknown-elf --enable-checking=yes,rtl --with-newlib --enable-sim --disable-gdb --disable-nls -- Summary: [4.3 Regression] ICE: in simplify_subreg, at simplify- rtx.c:4921 building libgfortran Product: gcc Version: 4.3.0 Status: UNCONFIRMED Keywords: ice-on-valid-code, build Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: rask at gcc dot gnu dot org GCC target triplet: sh-unknown-elf http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34040
[Bug middle-end/34040] [4.3 Regression] ICE: in simplify_subreg, at simplify-rtx.c:4921 building libgfortran
--- Comment #1 from rask at gcc dot gnu dot org 2007-11-09 10:45 --- Created an attachment (id=14513) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14513&action=view) test case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34040
[Bug target/30829] extra register zero extends
--- Comment #2 from rask at gcc dot gnu dot org 2007-11-09 11:08 --- It's not unusual to need more than one instruction pattern for the same machine instruction. See URL:http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01318.html and the followup for a recent example and what you can do about it. -- rask at gcc dot gnu dot org changed: What|Removed |Added Keywords||missed-optimization http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829
[Bug target/33474] bfin: ICE: RTL check: expected code 'set' or 'clobber', have 'parallel' in bfin_adjust_cost, at config/bfin/bfin.c:3120
--- Comment #1 from rask at gcc dot gnu dot org 2007-11-09 11:54 --- Created an attachment (id=14514) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14514&action=view) patch v1 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33474
[Bug target/33551] ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in m32c_immd_dbl_mov, at config/m32c/m32c.c:3010
--- Comment #2 from rask at gcc dot gnu dot org 2007-11-09 11:58 --- Tested revision 129966 which works. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Known to work||4.3.0 Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33551
[Bug rtl-optimization/21150] Suboptimal byte extraction from 64bits
--- Comment #4 from rask at gcc dot gnu dot org 2007-11-09 19:48 --- I think this might be a middle-end issue related to PR 7061 or PR 15184. We're doing slightly better with GCC 4.3.0 (because of subreg lowering, I guess), but not much (asm output with -dp for readability): a: movlv+44, %eax # 53*movsi_1/1 [length = 5] movlv+8, %edx # 23*movsi_1/1 [length = 6] shrl$8, %eax# 54*lshrsi3_1/1[length = 3] xorbv+36, %al # 11*xorqi_1/1 [length = 6] xorbv, %al # 13*xorqi_1/1 [length = 6] xorbv+54, %al # 17*xorqi_1/1 [length = 6] xorbv+63, %al # 21*xorqi_1/1 [length = 6] shrl$8, %edx# 24*lshrsi3_1/1[length = 3] xorl%edx, %eax # 66*xorsi_1/1 [length = 2] xorbv+18, %al # 29*xorqi_1/1 [length = 6] xorbv+27, %al # 33*xorqi_1/1 [length = 6] ret # 69return_internal [length = 1] b: pushl %ebx# 75*pushsi2[length = 1] movlv+20, %edx # 69*movsi_1/1 [length = 6] movlv+12, %ebx # 67*movsi_1/1 [length = 6] movlv+8, %ecx # 66*movsi_1/1 [length = 6] movlv+16, %eax # 68*movsi_1/1 [length = 5] shrdl $8, %ebx, %ecx # 81x86_shrd_1/1[length = 4] shrdl $16, %edx, %eax # 83x86_shrd_1/1[length = 4] movlv+24, %edx # 71*movsi_1/1 [length = 6] xorl%ecx, %eax # 70*xorsi_1/1 [length = 2] movlv+28, %ecx # 72*movsi_1/1 [length = 6] xorbv, %al # 13*xorqi_1/1 [length = 6] popl%ebx# 78popsi1 [length = 1] shrdl $24, %ecx, %edx # 85x86_shrd_1/1[length = 4] xorl%edx, %eax # 73*xorsi_1/1 [length = 2] movlv+44, %edx # 53*movsi_1/1 [length = 6] xorbv+36, %al # 21*xorqi_1/1 [length = 6] shrl$8, %edx# 54*lshrsi3_1/1[length = 3] xorl%edx, %eax # 74*xorsi_1/1 [length = 2] xorbv+54, %al # 29*xorqi_1/1 [length = 6] xorbv+63, %al # 33*xorqi_1/1 [length = 6] ret # 79return_internal [length = 1] c: movzbl v+9, %eax # 7 *movqi_1/3 [length = 7] xorbv+18, %al # 8 *xorqi_1/1 [length = 6] xorbv, %al # 9 *xorqi_1/1 [length = 6] xorbv+27, %al # 10*xorqi_1/1 [length = 6] xorbv+36, %al # 11*xorqi_1/1 [length = 6] xorbv+45, %al # 12*xorqi_1/1 [length = 6] xorbv+54, %al # 13*xorqi_1/1 [length = 6] xorbv+63, %al # 14*xorqi_1/1 
[length = 6] ret # 33return_internal [length = 1] d: pushl %ebx# 75*pushsi2[length = 1] movlv+20, %edx # 69*movsi_1/1 [length = 6] movlv+12, %ebx # 67*movsi_1/1 [length = 6] movlv+8, %ecx # 66*movsi_1/1 [length = 6] movlv+16, %eax # 68*movsi_1/1 [length = 5] shrdl $8, %ebx, %ecx # 81x86_shrd_1/1[length = 4] shrdl $16, %edx, %eax # 83x86_shrd_1/1[length = 4] movlv+24, %edx # 71*movsi_1/1 [length = 6] xorl%ecx, %eax # 70*xorsi_1/1 [length = 2] movlv+28, %ecx # 72*movsi_1/1 [length = 6] xorbv, %al # 13*xorqi_1/1 [length = 6] popl%ebx# 78popsi1 [length = 1] shrdl $24, %ecx, %edx # 85x86_shrd_1/1[length = 4] xorl%edx, %eax # 73*xorsi_1/1 [length = 2] movlv+44, %edx # 53*movsi_1/1 [length = 6] xorbv+36, %al # 21*xorqi_1/1 [length = 6] shrl$8, %edx# 54*lshrsi3_1/1[length = 3] xorl%edx, %eax # 74*xorsi_1/1 [length = 2] xorbv+54, %al # 29*xorqi_1/1 [length = 6] xorbv+63, %al # 33*xorqi_1/1 [length = 6] ret # 79return_internal [length = 1] .ident GCC: (GNU) 4.3.0 20071102 (experimental) -- rask at gcc dot gnu dot org changed: What|Removed |Added Known to fail||4.3.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21150
[Bug rtl-optimization/18560] better optimalization of EOR/MOV block.
--- Comment #7 from rask at gcc dot gnu dot org 2007-11-09 20:40 --- This has been fixed for more than a year: reverse: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. eor r3, r0, r0, ror #16 @ 12*arith_shiftsi [length = 4] bic r3, r3, #16711680 @ 13*arm_andsi3_insn/2 [length = 4] mov r0, r0, ror #8 @ 15*arm_shiftsi3 [length = 4] eor r0, r0, r3, lsr #8 @ 23*arith_shiftsi [length = 4] @ lr needed for prologue@ 36prologue_use[length = 4] bx lr @ 39return [length = 12] .size reverse, .-reverse .ident GCC: (GNU) 4.2.0 20060729 (experimental) 4.3.0 (revision 129967) generates the same code. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|NEW |RESOLVED Known to work||4.2.0 4.3.0 Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18560
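The four-instruction ARM sequence above is a byte reversal of a 32-bit word. A C transliteration of the listing, instruction by instruction, might look like this. It is a sketch: the names `ror32` and `reverse_bytes` are chosen here for illustration, and the PR's original C source is not reproduced.

```c
#include <stdint.h>

/* Rotate right by n bits, for 0 < n < 32.  */
static uint32_t ror32 (uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* Mirrors the listing:
     eor r3, r0, r0, ror #16
     bic r3, r3, #16711680      (16711680 == 0x00ff0000)
     mov r0, r0, ror #8
     eor r0, r0, r3, lsr #8  */
uint32_t reverse_bytes (uint32_t x)
{
  uint32_t t = x ^ ror32 (x, 16);
  t &= ~UINT32_C (0x00ff0000);
  x = ror32 (x, 8);
  return x ^ (t >> 8);
}
```

For example, reverse_bytes (0x12345678) yields 0x78563412.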
[Bug rtl-optimization/11873] inefficient use of registers induces size and time overhead
--- Comment #4 from rask at gcc dot gnu dot org 2007-11-09 23:51 --- This has improved (-O2 -fomit-frame-pointer): test: movl4(%esp), %eax # 32*movsi_1/1 [length = 4] movl8(%esp), %edx # 44*movsi_1/1 [length = 4] orl %eax, %edx # 6 *iorsi_1/1 [length = 2] addl$1, %eax# 35*addsi_1/1 [length = 3] cmpl$1, %edx# 38*cmpsi_1_insn/1 [length = 3] sbbl%edx, %edx # 39x86_movsicc_0_m1[length = 2] notl%edx# 40*one_cmplsi2_1 [length = 2] andl%edx, %eax # 41*andsi_1/1 [length = 2] ret # 47return_internal [length = 1] .ident GCC: (GNU) 4.3.0 20071102 (experimental) With -Os -fomit-frame-pointer we get: test: movl4(%esp), %edx # 32*movsi_1/1 [length = 4] xorl%eax, %eax # 48*movsi_xor [length = 2] movl8(%esp), %ecx # 43*movsi_1/1 [length = 4] orl %edx, %ecx # 7 *iorsi_3[length = 2] je .L3 # 8 *jcc_1 [length = 2] leal1(%edx), %eax # 44*lea_1 [length = 3] .L3: ret # 47return_internal [length = 1] With -O2/-Os -fomit-frame-pointer -march=pentiumpro: test: movl4(%esp), %edx # 32*movsi_1/1 [length = 4] xorl%eax, %eax # 46*movsi_xor [length = 2] leal1(%edx), %ecx # 41*lea_1 [length = 3] orl 8(%esp), %edx # 36*iorsi_3[length = 4] cmovne %ecx, %eax # 38*movsicc_noc/1 [length = 3] ret # 44return_internal [length = 1] I would probably code it like so: movl4(%esp), %eax ; 4 movl8(%esp), %edx ; 4 orl %eax, %edx; 2 addl$-1,%edx; 3 adcl$0, %eax; 3 ret ; 1 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11873
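Reading the listings above, the function appears to return a + 1 unless both arguments are zero, in which case it returns 0. That reading is inferred from the -Os output and is an assumption, since the PR's source is not quoted here. Under that assumption, the hand-written addl $-1 / adcl $0 sequence can be modelled in C as follows; the function names are hypothetical.

```c
#include <stdint.h>

/* Reference semantics, as inferred from the -Os listing.  */
uint32_t test_ref (uint32_t a, uint32_t b)
{
  return (a | b) != 0 ? a + 1 : 0;
}

/* Model of the suggested sequence:
     orl  %eax, %edx    ; edx = a | b
     addl $-1, %edx     ; carry set exactly when edx != 0
     adcl $0, %eax      ; eax = a + carry  */
uint32_t test_branchless (uint32_t a, uint32_t b)
{
  uint32_t t = a | b;
  uint32_t sum = (uint32_t) (t + UINT32_MAX);  /* t - 1 modulo 2^32 */
  uint32_t carry = sum < t;                    /* 1 iff t != 0 */
  return a + carry;
}
```

The trick relies on a being 0 whenever a | b is 0, so adding the carry to a gives 0 in that case without a branch or a conditional move.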
[Bug rtl-optimization/15792] missed subreg optimization
--- Comment #10 from rask at gcc dot gnu dot org 2007-11-10 00:15 --- This was fixed in 4.3.0. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|NEW |RESOLVED Keywords||ra Known to fail||4.1.2 4.2.0 4.2.1 4.2.2 Known to work||4.3.0 Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
[Bug rtl-optimization/23813] [4.3 Regression] redundant register assignments not eliminated
--- Comment #4 from rask at gcc dot gnu dot org 2007-11-10 00:59 --- We are regressing! Number of asm lines, not counting directives: 4.1.2: 88 4.2.0: 80 4.2.1: 80 4.2.2: 80 4.3.0: 100 (revision 129967) -- rask at gcc dot gnu dot org changed: What|Removed |Added Summary|redundant register |[4.3 Regression] redundant |assignments not eliminated |register assignments not ||eliminated http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23813
[Bug target/33474] bfin: ICE: RTL check: expected code 'set' or 'clobber', have 'parallel' in bfin_adjust_cost, at config/bfin/bfin.c:3120
--- Comment #2 from rask at gcc dot gnu dot org 2007-11-10 01:21 --- I can't reproduce this with revision 127331 or 129967. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33474
[Bug target/30315] optimize unsigned-add overflow test on x86 to use cpu flags from addl
--- Comment #16 from rask at gcc dot gnu dot org 2007-11-10 01:32 --- Two testcases which aren't optimized: unsigned int bad1 (unsigned int a) { unsigned int c = a - 1; if (c a) abort (); else return c; } unsigned int bad2 (unsigned int a) { unsigned int c = a - 2; if (c a) abort (); else return c; } See also URL:http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01359.html. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30315
[Bug target/30801] [4.3 Regression] performance regression on uint64_t operations
--- Comment #4 from rask at gcc dot gnu dot org 2007-11-07 16:35 --- Francois-Xavier, do you still see a performance regression? If so, please post asm output (-S -dp) from both versions? -- rask at gcc dot gnu dot org changed: What|Removed |Added CC||rask at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30801
[Bug target/32787] [4.2/4.3 Regression] Sun Studio 12 Undefined symbol addl
--- Comment #10 from rask at gcc dot gnu dot org 2007-11-06 20:14 --- Subject: Bug 32787 Author: rask Date: Tue Nov 6 20:14:22 2007 New Revision: 129944 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129944 Log: 2007-11-06 Rask Ingemann Lambertsen [EMAIL PROTECTED] PR target/32787 * config/i386/driver-i386.c: Test for __GNUC__ instead of GCC_VERSION which is always defined. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/driver-i386.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32787
[Bug rtl-optimization/3507] appalling optimisation with sub/cmp on multiple targets
-- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rask at gcc dot gnu dot org |dot org | Status|NEW |ASSIGNED Last reconfirmed|2007-08-23 20:10:25 |2007-11-02 14:17:51 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
[Bug target/31507] [4.3 Regression] libffi regression, many.c, closure_fn2/fn3.c with -Os
--- Comment #10 from rask at gcc dot gnu dot org 2007-10-31 14:44 --- Oops, I'm sorry about stealing your bug, Jakub. I didn't see you had taken it. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31507
[Bug target/31507] [4.3 Regression] libffi regression, many.c, closure_fn2/fn3.c with -Os
--- Comment #9 from rask at gcc dot gnu dot org 2007-10-31 14:37 --- Created an attachment (id=14448) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14448&action=view) patch for testing This seems to be a simple mismatch between what push_operand() accepts and what matches the '' constraint, so reload thinks the destination doesn't match any of the alternatives and doesn't know how to fix this sort of operand. Please test this patch on x86_64-apple-darwin8. -- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|jakub at gcc dot gnu dot org|rask at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31507
[Bug target/31507] [4.3 Regression] libffi regression, many.c, closure_fn2/fn3.c with -Os
--- Comment #12 from rask at gcc dot gnu dot org 2007-10-31 15:00 --- That's IMHO wrong, you are changing the meaning of constraint. Yes, I see what you mean, they ('' and '') are defined independently of stack direction. They should however accept PRE_MODIFY and POST_MODIFY. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31507
[Bug target/31507] [4.3 Regression] libffi regression, many.c, closure_fn2/fn3.c with -Os
--- Comment #5 from rask at gcc dot gnu dot org 2007-10-29 15:11 --- Is it possible to reproduce this on x86_64-unknown-linux-gnu somehow? I've tried to no avail with -Os and -mregparm=0 -Os on both testcases. What is a known bad revision? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31507
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #17 from rask at gcc dot gnu dot org 2007-10-30 00:14 ---
Created an attachment (id=14438) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14438&action=view) patch v3, varargs free

In reply to comment #16:
> You can cast them at the time of calling and store them as void * in the table --- that is standard-compliant.

Casting is what caused the problem to go unnoticed in the first place. This version will cause complaints from the compiler if there is a mixup.

-- rask at gcc dot gnu dot org changed: What|Removed |Added Attachment #14419|0 |1 is obsolete|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #10 from rask at gcc dot gnu dot org 2007-10-28 11:14 --- Created an attachment (id=14419) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14419&action=view) patch v2, i386 fix added -- rask at gcc dot gnu dot org changed: What|Removed |Added Attachment #14417|0 |1 is obsolete|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
[Bug other/29442] insn-attrtab has grown too large
--- Comment #7 from rask at gcc dot gnu dot org 2007-10-28 12:40 ---
Created an attachment (id=14420) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14420&action=view) Reenable alloca() on non-GCC compilers

The memory fragmentation problem seems to be caused by libiberty, which disables alloca() if you bootstrap with a non-GCC compiler[1] (which I do). Instead you get a malloc() based replacement. The df_* functions use alloca() and are called a lot. With this patch on top of the next one, process size tops out at around 185 MB compared to the 900+ MB it reached before, at which point I gave up with 256 MB of RAM. Just wondering, which compiler are you using to bootstrap GCC?

[1] http://gcc.gnu.org/ml/gcc-patches/2001-03/msg00194.html

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29442
[Bug other/29442] insn-attrtab has grown too large
--- Comment #8 from rask at gcc dot gnu dot org 2007-10-28 12:57 ---
Created an attachment (id=14421) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14421&action=view) split insn-attrtab.c into three files

Here's the patch to split insn-attrtab.c into smaller pieces. The result:

    $ wc -l insn-*tab.c
     14566 insn-attrtab.c
     44454 insn-dfatab.c
     36815 insn-latencytab.c
     95835 total

Please tell me if it makes a difference. (It is necessary to run autoheader and autoconf in the gcc directory after applying the patches.)

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29442
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #12 from rask at gcc dot gnu dot org 2007-10-28 14:00 ---
In reply to comment #11:
> How many times do I have to say this is bad for most RISC targets (hosts)?

I don't particularly care how many times you say it. Show some code (which works) and/or show some timings (of code that works).

> Really if the type being is used is wrong, they should be changed rather than changing to use var-args.

Will you please look at the code? You have gen_movsi() with 2 arguments and gen_addsi3() with 3 arguments. Given that they are put into a table as a function of type insn_gen_fn and called as such through the table (see the GEN_FCN() macro in optabs.h and the table definition in recog.h), the function definition has to match. If it doesn't, all bets are off as to what arguments the function receives. We have gen_* functions taking 0, 1, 2, 3, ... arguments and with GCC being designed the way it is, they need to be prototyped and defined with the same arguments. You are most definitely welcome to post some non-varargs code which works. You are most definitely also welcome to time it against my patches.

-- rask at gcc dot gnu dot org changed: What|Removed |Added Known to fail||4.3.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
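To make the type-matching argument concrete, here is one way a generator table can keep a distinct pointer type per arity, so that no cast is needed and a mixup draws a compiler diagnostic. This is a hypothetical sketch and not the attached patch: `struct insn_gen_entry`, `call_gen` and the `*_stub` functions are invented for illustration, and `rtx` is a stand-in for GCC's real type.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-in for GCC's rtx; illustration only.  */
struct rtx_def { int dummy; };
typedef struct rtx_def *rtx;

/* One function pointer type per arity instead of a single cast-to type.  */
typedef rtx (*gen_fn2) (rtx, rtx);
typedef rtx (*gen_fn3) (rtx, rtx, rtx);

struct insn_gen_entry {
  int n_operands;
  union { gen_fn2 fn2; gen_fn3 fn3; } gen;
};

/* Hypothetical stand-ins for generators such as gen_movsi/gen_addsi3.  */
static rtx gen_move_stub (rtx dst, rtx src) { (void) src; return dst; }
static rtx gen_add3_stub (rtx dst, rtx a, rtx b) { (void) dst; (void) a; return b; }

/* No casts: initializing .fn2 with a three-argument function would be
   diagnosed as an incompatible pointer type.  */
static const struct insn_gen_entry gen_table[] = {
  { 2, { .fn2 = gen_move_stub } },
  { 3, { .fn3 = gen_add3_stub } },
};

/* Call through the member whose type matches the definition.  */
static rtx call_gen (const struct insn_gen_entry *e, rtx *ops)
{
  switch (e->n_operands)
    {
    case 2: return e->gen.fn2 (ops[0], ops[1]);
    case 3: return e->gen.fn3 (ops[0], ops[1], ops[2]);
    default: return NULL;
    }
}
```

Each call goes through a pointer whose type matches the function's definition, which is the property the comment above says the cast to insn_gen_fn loses.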
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #13 from rask at gcc dot gnu dot org 2007-10-28 14:02 --- In reply to comment #7: I need it to build GCC with OpenWatcom, which wants parameters on the stack by default. Er, that's in registers by default, of course. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #15 from rask at gcc dot gnu dot org 2007-10-28 15:54 ---
In reply to comment #14:
> is your patch supposed to help with testcase presented in comment #6?

No, it's aimed at the problem from the description. That is, GCC itself doesn't work if compiled with a compiler where f(int a, int b) is called differently than f(int a, ...). I don't think your testcase is actually supposed to work. Don't you get a warning if you take out (void *) and compile with -Wall? As pinpointed in comment #2, GCC contains the same incorrect type cast. I actually forgot to patch this part:

    Index: gcc/genoutput.c
    ===
    --- gcc/genoutput.c (revision 129503)
    +++ gcc/genoutput.c (working copy)
    @@ -386,7 +386,7 @@ output_insn_data (void)
       }
       if (d->name && d->name[0] != '*')
    -    printf ("    (insn_gen_fn) gen_%s,\n", d->name);
    +    printf ("    gen_%s,\n", d->name);
       else
         printf ("    0,\n");

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
[Bug other/29442] insn-attrtab has grown too large
--- Comment #9 from rask at gcc dot gnu dot org 2007-10-28 17:48 --- I just tried with only the alloca patch, and despite the unsplit insn-attrtab.c file, process size tops out at just 205 MB. It looks like GCC 4.3.0 is in much better shape than GCC 4.1.1, so I'm letting go of the bug. Just to clarify, the process sizes were taken when the stage1 compiler was compiling the stage2 compiler. -- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|rask at gcc dot gnu dot org |unassigned at gcc dot gnu ||dot org Status|ASSIGNED|NEW Known to fail|4.3.0 | Known to work||4.3.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29442
[Bug other/29442] insn-attrtab has grown too large
--- Comment #6 from rask at gcc dot gnu dot org 2007-10-27 12:46 --- As far as I can tell (from running cc1 in a debugger), the problem is not so much the size of the file, but that it contains two large functions and GCC leaks memory. After compiling the first large function, the process size is 886 megs (basic-block reordering alone being responsible for about 200 megs), probably littered with pieces of garbage. That'll cause all following allocations to become scattered throughout memory, and the swapfest begins if you don't have lots of memory. 256 megs is not enough. The RTL passes in particular seem to be affected and several df_* functions take ages to run. I'm working on a patch to split insn-attrtab.c into more files. -- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rask at gcc dot gnu dot org |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Known to fail||4.3.0 Last reconfirmed|-00-00 00:00:00 |2007-10-27 12:46:05 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29442
[Bug target/18353] ICE with movaps in inline asm when using -masm=intel
--- Comment #4 from rask at gcc dot gnu dot org 2007-10-27 13:40 --- *** Bug 33918 has been marked as a duplicate of this bug. *** -- rask at gcc dot gnu dot org changed: What|Removed |Added CC||rayer at seznam dot cz http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18353
[Bug target/33918] GCC failed to produce assembler output with -masm=intel option
--- Comment #3 from rask at gcc dot gnu dot org 2007-10-27 13:40 --- This is not a regression (as far as I know), so it won't be fixed in anything earlier than 4.3.0. GCC dies trying to figure out which of BYTE PTR, WORD PTR, etc. it should print for a structure. You may be able to work around it by passing (callgate_ptr->offset) instead of (callgate_ptr). *** This bug has been marked as a duplicate of 18353 *** -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33918
[Bug target/33132] m32r: ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in insn_current_length, at insn-attrtab.c:29
--- Comment #6 from rask at gcc dot gnu dot org 2007-10-27 15:24 --- Tested revision 129548 which works. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33132
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #7 from rask at gcc dot gnu dot org 2007-10-27 23:16 --- Created an attachment (id=14417) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14417action=view) possible patch Please give this patch a try. I need it to build GCC with OpenWatcom, which wants parameters on the stack by default. -- rask at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rask at gcc dot gnu dot org |dot org | Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
[Bug target/12081] Gcc can't be compiled with -mregparm=3
--- Comment #9 from rask at gcc dot gnu dot org 2007-10-27 23:36 --- It happens to work because all the compilers people use to build GCC pass varargs the same way as non-varargs, at least for the number of arguments received by the gen_* functions. IOW you shouldn't see any speed difference if it works for you already. -- rask at gcc dot gnu dot org changed: What|Removed |Added Known to fail|4.3.0 | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081
[Bug target/28583] [4.2 regression] ICE in default_secondary_reload, at targhooks.c:532 when building libgcc2.c as _divsc3.o
--- Comment #7 from rask at gcc dot gnu dot org 2007-10-25 18:04 --- This works fine in 4.3. Looking at the commit log, I'd say it was fixed by revision 121981. -- rask at gcc dot gnu dot org changed: What|Removed |Added Known to work||4.3.0 Summary|[4.2/4.3 regression] ICE in |[4.2 regression] ICE in |default_secondary_reload, at|default_secondary_reload, at |targhooks.c:532 when|targhooks.c:532 when |building libgcc2.c as |building libgcc2.c as |_divsc3.o |_divsc3.o http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28583
[Bug target/30801] [4.3 Regression] performance regression on uint64_t operations
--- Comment #3 from rask at gcc dot gnu dot org 2007-10-25 18:58 ---
I see a substantial improvement when testing on the compile farm hardware:

processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 2212
stepping : 3
cpu MHz : 2000.240
cache size : 1024 KB
...

$ gcc --version | head -n 1
gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
$ gcc -O3 ~/pr30801.c && time ./a.out
064069fbc13963b920219c3e939225e38e38e38e3956d81c71c71c71c0ba0f00
real 0m0.555s
user 0m0.552s
sys 0m0.004s

$ (cd ~/build/gcc-x86_64-unknown-linux-gnu/gcc && ./xgcc --version | head -n 1)
xgcc (GCC) 4.3.0 20071022 (experimental)
$ (cd ~/build/gcc-x86_64-unknown-linux-gnu/gcc && ./xgcc -B./ -O3 ~/pr30801.c && time ./a.out)
064069fbc13963b920219c3e939225e38e38e38e3956d81c71c71c71c0ba0f00
real 0m0.455s
user 0m0.452s
sys 0m0.004s

Note that your -march=pentium4 option is rejected without -m32:

$ gcc -march=pentium4 -O3 ~/pr30801.c && time ./a.out
/home/rask/pr30801.c:1: error: CPU you selected does not support x86-64 instruction set
/home/rask/pr30801.c:1: error: CPU you selected does not support x86-64 instruction set
$ gcc -O3 ~/pr30801.c -m32 -march=pentium4 && time ./a.out
064069fbc13963b920219c3e939225e38e38e38e3956d81c71c71c71c0ba0f00
real 0m2.234s
user 0m2.232s
sys 0m0.004s
$ (cd ~/build/gcc-x86_64-unknown-linux-gnu/gcc && ./xgcc -B./ -O3 ~/pr30801.c -m32 -march=pentium4 && time ./a.out)
064069fbc13963b920219c3e939225e38e38e38e3956d81c71c71c71c0ba0f00
real 0m1.488s
user 0m1.484s
sys 0m0.004s

So GCC 4.3 is 22 % faster with just the default -m64 + no -march and an impressive 50 % faster with -m32 -march=pentium4.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30801
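The 22 % and 50 % figures follow from the "real" times if "faster" is read as old time divided by new time, minus one. A minimal sketch of that arithmetic, where the helper name percent_faster is made up for illustration and is not part of the bug report:

```c
#include <assert.h>

/* Hypothetical helper: how much faster the new run is, in percent,
   defined as (old_seconds / new_seconds - 1) * 100. */
double percent_faster(double old_seconds, double new_seconds)
{
    /* Both timings must be positive for the ratio to make sense. */
    assert(old_seconds > 0.0 && new_seconds > 0.0);
    return (old_seconds / new_seconds - 1.0) * 100.0;
}
```

With the timings from the comment, percent_faster(0.555, 0.455) comes out at roughly 22 and percent_faster(2.234, 1.488) at roughly 50.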
[Bug target/29493] -masm=intel - does not emit right asm code
--- Comment #5 from rask at gcc dot gnu dot org 2007-10-22 11:51 --- Subject: Bug 29493 Author: rask Date: Mon Oct 22 11:50:56 2007 New Revision: 129548 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=129548 Log: PR target/29473 PR target/29493 * config/i386/i386.c (output_pic_addr_const): Support Intel asm syntax. (print_reg): Print register prefix only with ATT asm syntax. Support pc_rtx for RIP register. (print_operand_address): Use print_reg()'s pc_rtx support for RIP relative addressing. Always print segment register prefix with ATT asm syntax and never with Intel asm syntax. (print_operand): Suppress 'XXX PTR' prefix for BLKmode operands. Fix prefix for 16-byte XFmode operands. (output_addr_const_extra): Support Intel asm syntax. (x86_file_start): Don't use register prefix with Intel asm syntax. * config/i386/i386.md (*zero_extendqihi2_movzbl): Fix typo. (return_internal_long): Fix Intel asm syntax output. (set_got_rex64): Support Intel asm syntax. (set_rip_rex64): Likewise. (set_got_offset_rex64): Likewise. (*sibcall_1_rex64_v): Print register prefix only with ATT asm syntax. (*tls_global_dynamic_64): Likewise. (*tls_local_dynamic_base_64): Likewise. (*load_tp_si)(*load_tp_di): Likewise. (*add_tp_si)(*add_tp_di): Likewise. (*tls_dynamic_lea_64): Likewise. (*sibcall_value_1_rex64_v): Likewise. (stack_tls_protect_set_si): Likewise. (stack_tls_protect_set_di): Likewise. (stack_tls_protect_test_si): Likewise. (stack_tls_protect_test_di): Likewise. * config/i386/mmx.md (*movmode_internal_rex64): Fix Intel asm syntax output. (*movv2sf_internal_rex64): Likewise. * config/i386/cpuid.h (__cpuid): Support Intel asm syntax. (__get_cpuid_max): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/cpuid.h trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.md trunk/gcc/config/i386/mmx.md -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29493
[Bug target/29473] -masm=intel combined with -march=athlon64 has some issues.
--- Comment #12 from rask at gcc dot gnu dot org 2007-10-22 11:51 --- Subject: Bug 29473 Author: rask Date: Mon Oct 22 11:50:56 2007 New Revision: 129548 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=129548 Log: PR target/29473 PR target/29493 * config/i386/i386.c (output_pic_addr_const): Support Intel asm syntax. (print_reg): Print register prefix only with ATT asm syntax. Support pc_rtx for RIP register. (print_operand_address): Use print_reg()'s pc_rtx support for RIP relative addressing. Always print segment register prefix with ATT asm syntax and never with Intel asm syntax. (print_operand): Suppress 'XXX PTR' prefix for BLKmode operands. Fix prefix for 16-byte XFmode operands. (output_addr_const_extra): Support Intel asm syntax. (x86_file_start): Don't use register prefix with Intel asm syntax. * config/i386/i386.md (*zero_extendqihi2_movzbl): Fix typo. (return_internal_long): Fix Intel asm syntax output. (set_got_rex64): Support Intel asm syntax. (set_rip_rex64): Likewise. (set_got_offset_rex64): Likewise. (*sibcall_1_rex64_v): Print register prefix only with ATT asm syntax. (*tls_global_dynamic_64): Likewise. (*tls_local_dynamic_base_64): Likewise. (*load_tp_si)(*load_tp_di): Likewise. (*add_tp_si)(*add_tp_di): Likewise. (*tls_dynamic_lea_64): Likewise. (*sibcall_value_1_rex64_v): Likewise. (stack_tls_protect_set_si): Likewise. (stack_tls_protect_set_di): Likewise. (stack_tls_protect_test_si): Likewise. (stack_tls_protect_test_di): Likewise. * config/i386/mmx.md (*movmode_internal_rex64): Fix Intel asm syntax output. (*movv2sf_internal_rex64): Likewise. * config/i386/cpuid.h (__cpuid): Support Intel asm syntax. (__get_cpuid_max): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/cpuid.h trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.md trunk/gcc/config/i386/mmx.md -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29473
[Bug target/29473] -masm=intel combined with -march=athlon64 has some issues.
--- Comment #13 from rask at gcc dot gnu dot org 2007-10-22 13:06 --- Fixed as of revision 129548. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29473
[Bug target/29493] -masm=intel - does not emit right asm code
--- Comment #6 from rask at gcc dot gnu dot org 2007-10-22 13:07 --- Fixed as of revision 129548. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29493
[Bug target/14582] [asm=intel] float to unsigned int conversion fills only 16 of 32 bits
--- Comment #3 from rask at gcc dot gnu dot org 2007-10-22 13:47 --- http://sourceware.org/ml/binutils/2004-06/msg00419.html It was a bug in gas. The testcase works these days. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|NEW |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14582
[Bug target/18353] ICE with movaps in inline asm when using -masm=intel
--- Comment #3 from rask at gcc dot gnu dot org 2007-10-22 14:00 --- This works with revision 129548, which I think is the one that fixed it. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|NEW |RESOLVED Known to fail|3.3.3 3.4.3 4.0.0 4.1.0 |3.3.3 3.4.3 4.0.0 4.1.0 |4.2.0 4.3.0 |4.2.0 Known to work||4.3.0 Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18353
[Bug pch/33829] New: HOST_HOOKS_GT_PCH_GET_ADDRESS is missing from the documentation index
-- Summary: HOST_HOOKS_GT_PCH_GET_ADDRESS is missing from the documentation index Product: gcc Version: 4.3.0 Status: UNCONFIRMED Keywords: documentation Severity: minor Priority: P3 Component: pch AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: rask at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33829
[Bug c/33782] New: [4.3 Regression] FAIL: gcc.c-torture/compile/limits-stringlit.c (test for excess errors)
This started happening between revision 128772 (good) and 128824 (bad): .../gcc/testsuite/gcc.c-torture/compile/limits-stringlit.c:10: error: size of array is too large This appears to happen on all targets with 16-bit int. Revision 128811 suspected. -- Summary: [4.3 Regression] FAIL: gcc.c-torture/compile/limits- stringlit.c (test for excess errors) Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: rask at gcc dot gnu dot org GCC target triplet: m32c-unknown-elf avr-unknown-none http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33782
[Bug c/33629] bad code with -O2 if pointer dereference followed by null test
--- Comment #3 from rask at gcc dot gnu dot org 2007-10-03 22:13 --- You could open a request for a warning when a null pointer check is optimized away after dereferencing the pointer. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33629
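For readers who have not seen the pattern, here is a minimal sketch of the kind of code this bug is about. The reporter's test case is not quoted in this thread, so the function names below are invented for illustration. The documented behaviour referred to in comment #1 is, as far as I know, the -fdelete-null-pointer-checks optimization described on the Optimize Options page.

```c
#include <assert.h>
#include <stddef.h>

/* Dereference first, test afterwards. Because *p has already been
   evaluated, the compiler may assume p is non-null and drop the test
   when optimizing. Calling this with NULL is undefined behaviour. */
int read_then_check(const int *p)
{
    int value = *p;
    if (p == NULL)
        return -1;
    return value;
}

/* Test first, dereference afterwards: the check cannot be removed. */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}

/* Small self-check; only the safe variant is ever given NULL. */
void null_check_demo(void)
{
    int x = 42;
    assert(check_then_read(NULL) == -1);
    assert(check_then_read(&x) == 42);
    assert(read_then_check(&x) == 42);
}
```

Reordering the test before the dereference, as in check_then_read, avoids depending on how the optimizer treats the later check.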
[Bug c/33629] bad code with -O2 if pointer dereference followed by null test
--- Comment #1 from rask at gcc dot gnu dot org 2007-10-02 18:08 --- This behaviour is as documented. http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33629
[Bug libstdc++/33603] configuration failure during native build
--- Comment #1 from rask at gcc dot gnu dot org 2007-09-30 20:35 --- Please look in your config.log for messages from collect2 and post the last linker failure one plus any that look wrong. -- rask at gcc dot gnu dot org changed: What|Removed |Added Keywords||build http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33603
[Bug bootstrap/25672] [4.1/4.2 regression] cross build's libgcc picks up CFLAGS
--- Comment #18 from rask at gcc dot gnu dot org 2007-09-26 21:06 --- I ran into this with 4.3 a few weeks ago. -- rask at gcc dot gnu dot org changed: What|Removed |Added Known to fail|4.1.2 4.2.0 |4.1.2 4.2.0 4.3.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25672
[Bug target/33551] New: ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in m32c_immd_dbl_mov, at config/m32c/m32c.c:3010
Revision 128761 fails to compile the test case from bug 32776: $ gcc/xgcc -Bgcc/ -g -O2 -mcpu=m32cm /n/12/rask/dtoa-m32c.c -S -o /dev/null /n/12/rask/src/all/newlib/libc/stdlib/dtoa.c: In function '_dtoa_r': /n/12/rask/src/all/newlib/libc/stdlib/dtoa.c:862: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in m32c_immd_dbl_mov, at config/m32c/m32c.c:3010 Configure flags: --target m32c-unknown-elf --enable-checking=yes,rtl --with-newlib --enable-sim --disable-gdb --disable-nls --enable-languages=c,c++ --enable-cxx-flags=-Os -- Summary: ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in m32c_immd_dbl_mov, at config/m32c/m32c.c:3010 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Keywords: ice-on-valid-code, ice-checking Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: rask at gcc dot gnu dot org GCC target triplet: m32c-unknown-elf http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33551
[Bug middle-end/32656] [4.3 regression] m32c: ICE in smallest_mode_for_size, at stor-layout.c:220
--- Comment #8 from rask at gcc dot gnu dot org 2007-09-25 12:05 --- It works for me too at revision 128761. -- rask at gcc dot gnu dot org changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32656
[Bug tree-optimization/33498] Optimizer (-O2) may convert a normal loop to infinite
--- Comment #13 from rask at gcc dot gnu dot org 2007-09-20 09:55 ---
> Are you telling me that *any* integer overflow allows a compiler to generate a
> buggy code without any notice ?

No, unsigned integer overflow is well defined.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33498
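To make the distinction concrete, here is a sketch of a table fill that does its arithmetic in an unsigned type. The original test case is not quoted in this thread, so this is a reconstruction under assumptions: the start value and step are the constants visible in the assembly listing in comment #3 (50462976 and 67372036), the 64-entry length is inferred from the cmpl $63 in that listing, and the name table_init_unsigned is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_WORDS 64  /* assumed length, inferred from "cmpl $63" */

/* Fill the table with 0x03020100, 0x07060504, ... using unsigned
   arithmetic, so wraparound is well defined (modulo 2^32). */
void table_init_unsigned(uint32_t *table)
{
    uint32_t val = 0x03020100u;  /* 50462976 in the listing */
    size_t i;

    assert(table != NULL);
    for (i = 0; i < TABLE_WORDS; i++) {
        table[i] = val;
        val += 0x04040404u;      /* 67372036; wraps only on the final addition */
    }
}
```

With a signed int accumulator the same sequence would pass INT_MAX about halfway through the table on a target with 32-bit int, which is the undefined overflow discussed in this bug.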
[Bug tree-optimization/33498] [4.2/4.3 Regression] Optimizer (-O2) may convert a normal loop to infinite
--- Comment #3 from rask at gcc dot gnu dot org 2007-09-19 16:12 ---
Technically, the code is undefined (overflow of signed integer val). Using -O2 -fno-strict-overflow results in a loop test, but the code looks dubious:

table_init:
        pushl   %ebp                      # 51 *pushsi2 [length = 1]
        movl    $117835012, %eax          # 20 *movsi_1/1 [length = 5]
        movl    %esp, %ebp                # 52 *movsi_1/1 [length = 2]
        movl    $2, %edx                  # 21 *movsi_1/1 [length = 5]
        movl    8(%ebp), %ecx             # 14 *movsi_1/1 [length = 3]
        movl    $50462976, (%ecx)         # 19 *movsi_1/2 [length = 6]
        .p2align 4,,7
.L2:
        movl    %eax, -4(%ecx,%edx,4)     # 24 *movsi_1/2 [length = 4]
        addl    $67372036, %eax           # 26 *addsi_1/1 [length = 6]
        addl    $1, %edx                  # 27 *addsi_1/1 [length = 3]
        cmpl    $67305984, %eax           # 29 *cmpsi_1_insn/1 [length = 6]
        jne     .L2                       # 30 *jcc_1 [length = 2]
        popl    %ebp                      # 55 popsi1 [length = 1]
        ret                               # 56 return_internal [length = 1]

-O2 -fno-tree-loop-optimize produces code which looks like it might even loop the intended number of times:

table_init:
        pushl   %ebp                      # 43 *pushsi2 [length = 1]
        movl    $1, %eax                  # 12 *movsi_1/1 [length = 5]
        movl    %esp, %ebp                # 44 *movsi_1/1 [length = 2]
        movl    $117835012, %edx          # 13 *movsi_1/1 [length = 5]
        movl    8(%ebp), %ecx             # 6 *movsi_1/1 [length = 3]
        movl    $50462976, (%ecx)         # 11 *movsi_1/2 [length = 6]
        .p2align 4,,7
.L2:
        movl    %edx, (%ecx,%eax,4)       # 16 *movsi_1/2 [length = 3]
        addl    $1, %eax                  # 20 *addsi_1/1 [length = 3]
        addl    $67372036, %edx           # 18 *addsi_1/1 [length = 6]
        cmpl    $63, %eax                 # 21 *cmpsi_1_insn/1 [length = 3]
        jle     .L2                       # 22 *jcc_1 [length = 2]
        popl    %ebp                      # 47 popsi1 [length = 1]
        ret                               # 48 return_internal [length = 1]

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33498
[Bug tree-optimization/33498] [4.2/4.3 Regression] Optimizer (-O2) may convert a normal loop to infinite
--- Comment #5 from rask at gcc dot gnu dot org 2007-09-19 16:39 ---
table_init:
        pushl   %ebp                      # 51 *pushsi2 [length = 1]
        movl    $117835012, %eax          # 20 *movsi_1/1 [length = 5]
        movl    %esp, %ebp                # 52 *movsi_1/1 [length = 2]
        movl    $2, %edx                  # 21 *movsi_1/1 [length = 5]
        movl    8(%ebp), %ecx             # 14 *movsi_1/1 [length = 3]
        movl    $50462976, (%ecx)         # 19 *movsi_1/2 [length = 6]
        .p2align 4,,7
.L2:
        movl    %eax, -4(%ecx,%edx,4)     # 24 *movsi_1/2 [length = 4]
        addl    $67372036, %eax           # 26 *addsi_1/1 [length = 6]
        addl    $1, %edx                  # 27 *addsi_1/1 [length = 3]
        cmpl    $67305984, %eax           # 29 *cmpsi_1_insn/1 [length = 6]
        jne     .L2                       # 30 *jcc_1 [length = 2]
        popl    %ebp                      # 55 popsi1 [length = 1]
        ret                               # 56 return_internal [length = 1]

GCC 4.2.1, btw.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33498