[Bug rtl-optimization/62146] CSE replaces constant with an expression incorrectly

2014-09-08 Thread eraman at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62146

Easwaran Raman  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Easwaran Raman  ---
Google ref: b/16870586


[Bug rtl-optimization/62146] CSE replaces constant with an expression incorrectly

2014-08-14 Thread eraman at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62146

--- Comment #1 from Easwaran Raman  ---
Created attachment 1
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=1&action=edit
Patch to remove obsolete REG_EQUAL note during RTL copy prop

This patch kills the REG_EQUAL note during cprop when the propagated value is a
constant. This doesn't fix the CSE problem that conflates 0 and the symbol.


[Bug rtl-optimization/62146] New: CSE replaces constant with an expression incorrectly

2014-08-14 Thread eraman at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62146

Bug ID: 62146
   Summary: CSE replaces constant with an expression incorrectly
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eraman at google dot com

Created attachment 0
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=0&action=edit
preprocessed test case obtained using creduce

$ ./g++_4_9 --version
g++_4_9 (GCC) 4.9.2 20140814 (prerelease)

$ ./g++_4_9 test.ii -std=gnu++11 -O2 -S

Looking at the test.s, we see this following fragment in _ZN2C13fooEv:

.L8:
        xorl    %esi, %esi
        testq   %rbp, %rbp
        je      .L7
        movl    $16, %edi
        call    _Znwm
        movq    $_ZTV1GIN2C19TokenTypeEE+16, (%rax)
        movq    $_ZN2C19TokenType8AddTokenEv, 8(%rax)
        movq    %rax, %rsi


The _Znwm corresponds to new G < CL > ( 0) in test.ii:22. Once the object is
allocated, the code stores $_ZN2C19TokenType8AddTokenEv at offset 8. This is
incorrect: it should store 0 there. Codegen is fine at -O1. Corresponding code
with -O1:

.L9:
        movl    $0, %esi
        testq   %rbp, %rbp
        je      .L8
        movl    $16, %edi
        call    _Znwm
        movq    $_ZTV1GIN2C19TokenTypeEE+16, (%rax)
        movq    $0, 8(%rax)
        movq    %rax, %rsi

There seem to be multiple issues here. I am not able to reproduce the issue on
trunk, but I suspect it is simply hidden by some other transformation.

The first CSE pass incorrectly replaces 0 with _ZN2C19TokenType8AddTokenEv. We
have this code before rtl-cse1:
...
(insn 18 17 19 4 (set (reg/f:DI 90)
(symbol_ref/i:DI ("_ZN2C19TokenType8AddTokenEv") [flags 0x1] 
)) test.ii:22 89 {*movdi_internal}
 (nil))
(insn 19 18 20 4 (set (reg:CCZ 17 flags)
(compare:CCZ (reg/f:DI 90)
(const_int 0 [0]))) test.ii:22 4 {*cmpdi_ccno_1}
 (nil))
(jump_insn 20 19 21 4 (set (pc)
(if_then_else (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref:DI 48)
(pc))) test.ii:22 596 {*jcc_1}
 (int_list:REG_BR_PROB 2165 (nil))
 -> 48)
...
(code_label 48 8 47 6 9 "" [1 uses])
(note 47 48 7 6 [bb 6] NOTE_INSN_BASIC_BLOCK)
(insn 7 47 27 6 (set (reg/f:DI 88 [ D.2494 ])
(const_int 0 [0])) test.ii:22 89 {*movdi_internal}
 (nil))

In this EBB, CSE first looks at insn 18 and concludes that reg 90 and
symbol_ref/i:DI ("_ZN2C19TokenType8AddTokenEv") are equal. Then it looks at the
branch, concludes that reg 90 has to be 0 in BB 6, and puts 0 in the same
equivalence class. Then it looks at insn 7 and replaces the const_int 0 with a
reference to the _ZN2C19TokenType8AddTokenEv symbol. This shouldn't happen.
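The conflation can be illustrated with a toy model. This is only a sketch and
not GCC's cse.c: EquivTable, record_equal, same_class and
zero_conflated_with_symbol are hypothetical names, and the strings merely label
the RTL expressions from the dump. It treats equivalences as a union-find
structure and replays the two facts CSE records in this EBB.

```cpp
#include <map>
#include <string>

// Toy equivalence table: every expression maps to a class representative.
class EquivTable {
 public:
  // Record that expressions a and b are known to hold the same value.
  void record_equal(const std::string& a, const std::string& b) {
    const std::string ra = find(a);
    const std::string rb = find(b);
    parent_[ra] = rb;
  }

  // True if a and b ended up in the same equivalence class.
  bool same_class(const std::string& a, const std::string& b) {
    return find(a) == find(b);
  }

 private:
  // Return the representative of x, creating a singleton class if needed.
  std::string find(const std::string& x) {
    auto it = parent_.find(x);
    if (it == parent_.end()) {
      parent_[x] = x;
      return x;
    }
    if (it->second == x) return x;
    const std::string next = it->second;  // copy before recursing
    const std::string root = find(next);
    parent_[x] = root;                    // path compression
    return root;
  }

  std::map<std::string, std::string> parent_;
};

// Replays the facts from the dump: insn 18 makes reg 90 equal to the symbol,
// and the taken edge of jump_insn 20 makes reg 90 equal to 0.
bool zero_conflated_with_symbol() {
  EquivTable t;
  t.record_equal("reg90", "sym:_ZN2C19TokenType8AddTokenEv");
  t.record_equal("reg90", "const_int:0");
  return t.same_class("const_int:0", "sym:_ZN2C19TokenType8AddTokenEv");
}
```

Once both facts share one class, a lookup for const_int 0 can return the
symbol, which is the replacement seen in insn 7.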

Next, PRE commons out this symbol into a register and puts a REG_EQUAL note on
this insn to say that the source register is equal to
_ZN2C19TokenType8AddTokenEv.

(insn 52 47 27 6 (set (reg/f:DI 88 [ D.2494 ])
(reg/f:DI 91 [ D.2494 ])) test.ii:22 -1
 (expr_list:REG_EQUAL (symbol_ref/i:DI ("_ZN2C19TokenType8AddTokenEv")
[flags 0x1]  )
(nil)))


rtl-cprop2 then replaces register 91 with 0 again. However, it leaves the
REG_EQUAL note on insn 52.

So far, this shouldn't affect anything, as all of this happens in a dead BB (the
branch above is never taken). However, CE hoists insn 52 above the branch:

(insn 52 17 19 4 (set (reg/f:DI 88 [ D.2494 ])
(const_int 0 [0])) test.ii:22 89 {*movdi_internal}
 (expr_list:REG_EQUAL (symbol_ref/i:DI ("_ZN2C19TokenType8AddTokenEv")
[flags 0x1]  )
(nil)))
(insn 19 52 20 4 (set (reg:CCZ 17 flags)
(compare:CCZ (reg/f:DI 91 [ D.2494 ])
(const_int 0 [0]))) test.ii:22 4 {*cmpdi_ccno_1}
 (expr_list:REG_EQUAL (compare:CCZ (symbol_ref/i:DI
("_ZN2C19TokenType8AddTokenEv") [flags 0x1]  )
(const_int 0 [0]))
(nil)))
(jump_insn 20 19 21 4 (set (pc)
(if_then_else (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref:DI 27)
(pc))) test.ii:22 596 {*jcc_1}
 (expr_list:REG_DEAD (reg:CCZ 17 flags)
(int_list:REG_BR_PROB 2165 (nil)))
 -> 27)


Now, a later CSE pass (rtl-cse2) looks at the EBB containing the above jump
(insn 19) and the *fallthru* and thinks that 0 and _ZN2C19TokenType8AddTokenEv
are equal based on the REG_EQUAL note. It changes the following insn in the
fallthru

(insn 26 25 8 7 (set (mem/f:DI (plus:DI (reg/f:DI 87 [ D.2493 ])
(const_int 8 [0x8])) [4 MEM[(struct G *)_13].member_+0 S8 A64])
(const_int 0 [0])) test.ii:10 89 {*movdi_internal}
 (nil))

to

(insn 26 25 8 7 (set (mem/f:DI (plus:DI (reg/f:DI 87 [ D.2493 ])
(const_int 8 [0x8])) [4 MEM[(struct G *)_13].member_+0 S8 A64])
(symbo

[Bug c++/59031] New: vtable lookup not optimized away

2013-11-06 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59031

Bug ID: 59031
   Summary: vtable lookup not optimized away
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eraman at google dot com
CC: jason at redhat dot com

Created attachment 31175
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31175&action=edit
Proposed patch

The fix to PR c++/11750 at r193504 caused a regression in the following code at
-O0:

class B {
 public:
  virtual int add (int a, int b) {return a+ b;}
};

class D : public B {
};

int foo (int a, int b) {
  D d;
  return d.add(a, b);
}

The call d.add(a, b) used to generate a direct call to B::add, but now
generates an indirect call. At -O2, FRE could devirtualize this in some
situations but not always. The attached patch fixes this case and bootstraps
fine. Is this a reasonable fix?
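As an illustration of why a direct call is valid here: d is a complete object
of type D, so its dynamic type is known and the final overrider of add is
B::add. The sketch below reuses the test case's classes; foo_virtual and
foo_direct are hypothetical names, and the qualified call only shows the
source-level equivalent of the direct call, not what any GCC version emits.

```cpp
class B {
 public:
  virtual ~B() = default;
  virtual int add(int a, int b) { return a + b; }
};

class D : public B {
};

// Call through the usual virtual-call syntax, as in the test case.
int foo_virtual(int a, int b) {
  D d;
  return d.add(a, b);
}

// A qualified call suppresses virtual dispatch and names B::add directly.
int foo_direct(int a, int b) {
  D d;
  return d.B::add(a, b);
}
```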


[Bug c++/33911] attribute deprecated vs. templates

2013-09-30 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33911

Easwaran Raman  changed:

   What|Removed |Added

 CC||eraman at google dot com

--- Comment #10 from Easwaran Raman  ---
For template member functions, is concatenating the parsed attributes to
prefix_attributes and passing them to grokfield a valid fix? This patch works
for a small test case I have and I wonder if this is correct:

Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c(revision 202261)
+++ gcc/cp/parser.c(working copy)
@@ -16489,7 +16489,7 @@ cp_parser_init_declarator (cp_parser* parser,
   decl = grokfield (declarator, decl_specifiers,
 initializer, !is_non_constant_init,
 /*asmspec=*/NULL_TREE,
-prefix_attributes);
+chainon (prefix_attributes, attributes));
   if (decl && TREE_CODE (decl) == FUNCTION_DECL)
 cp_parser_save_default_args (parser, decl);
 }


[Bug middle-end/57393] [4.9 Regression] error: definition in block 4 follows the use / internal compiler error: verify_ssa failed

2013-08-29 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57393

Easwaran Raman  changed:

   What|Removed |Added

  Attachment #30690|0   |1
is obsolete||

--- Comment #30 from Easwaran Raman  ---
Created attachment 30727
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30727&action=edit
New patch

This fixes the ICE. It bootstraps OK with this patch, but I haven't run the
tests.


[Bug middle-end/57393] [4.9 Regression] error: definition in block 4 follows the use / internal compiler error: verify_ssa failed

2013-08-28 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57393

--- Comment #27 from Easwaran Raman  ---
These two test cases pass for me (they compile with -O3) with the attached patch
(http://gcc.gnu.org/bugzilla/attachment.cgi?id=30690).
gcc --version returns:

gcc (GCC) 4.9.0 20130821 (experimental)

At what revision are you still getting the ICEs? The patch attached to this bug
subsumes the one posted in
http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01584.html .

(In reply to Joost VandeVondele from comment #24)
> Also the other 'dup' PRs still fail (gcc -O3) . Collecting testcases here:
> 
> > cat  PR58018.c
> 
> int a, b, c, d, e;
> 
> void bar (int p)
> {
>   int f = b;
>   e &= p <= (f ^= 0);
> }
> 
> void foo ()
> {
>   for (; d; d++)
>     {
>       bar (a && c);
>       bar (0);
>       bar (1);
>     }
> }
> 
> 
> > cat PR58131.c
> 
> short a;
> int b, c;
> int d[1][4][2];
> 
> void foo ()
> {
>   int *e;
>   for (b = 1;; b--)
>     {
>       if (*e)
>         break;
>       for (c = 2; c >= 0; c--)
>         {
>           *e |= d[0][3][b] != a;
>           int *f = &d[0][3][b];
>           *f = 0;
>         }
>     }
> }

[Bug middle-end/57393] [4.9 Regression] error: definition in block 4 follows the use / internal compiler error: verify_ssa failed

2013-08-22 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57393

--- Comment #22 from Easwaran Raman  ---
(In reply to Marek Polacek from comment #20)
> Yes, the patch maybe fixes the debuginfo issue, but there's something else
> that is wrong.  E.g., on the testcase from PR58018, we have in
> reassociate_bb *after*
> (and that is important) optimize_range_tests this:
> 
> :
> [...]
> e.1_16 = _14 & e.0_15;
> _17 = f_12 >= 0;
> _18 = (int) _17;
> e.1_19 = e.1_16 & _18;
> _20 = f_12 > 0;
> _23 = f_12 > 0;
> _24 = (int) _23;
> _21 = (int) _20;
> e.1_22 = e.1_19 & _21;
> [...]
> 
> Now, in reassociate_bb, we go over the stmts, from the last stmt to the
> first stmt in the bb.  For the appropriate stmts, we call rewrite_expr_tree
> to rewrite the linearized statements according to the operand_entry_t ops
> vector, in this case we call it on
>   e.1_22 = e.1_19 & _21;
> and the vector ops contains
>   Op 0 -> rank: 589826, tree: _14
>   Op 1 -> rank: 3, tree: _24
>   Op 2 -> rank: 1, tree: e.0_15
> 
> In rewrite_expr_tree, we recursively call this function on e.1_19, whose
> SSA_NAME_DEF_STMT is
>   e.1_19 = e.1_16 & _18;
> This stmt is then transformed into
>   e.1_19 = _24 & e.0_15;
> 
> But, at the point where e.1_19 is defined, the _24 isn't defined yet!
> 
> So, it seems, ensure_ops_are_available should handle a situation like this. 
> However, it doesn't: perhaps the issue is that find_insert_point won't find
> the right insert point (the stmt is e.1_19 = e.1_16 & _18;, the dep_stmt is
> _24 = (int) _23;), in there we have:
> 
>   if (gimple_uid (insert_stmt) == gimple_uid (dep_stmt)
>       && gimple_bb (insert_stmt) == gimple_bb (dep_stmt)
>       && insert_stmt != dep_stmt)
>     insert_stmt = appears_later_in_bb (insert_stmt, dep_stmt);
>   else if (not_dominated_by (insert_stmt, dep_stmt))
>     insert_stmt = dep_stmt;
>   return insert_stmt;
> 
> Neither of these conditions holds; gimple_uid of the dep_stmt is 0 and of
> insert_stmt it is 16.  Thus, find_insert_point returns e.1_19 = e.1_16 &
> _18;.  That's wrong, I suppose.
> Maybe the issue is that if the two stmts are in the same bb, we just look at
> their UIDs and based on that we find out the dependency, but the new stmts
> coming from optimize_range_tests don't have gimple UIDs set, thus this
> can't work.
> Likely I'm wrong, would appreciate if someone could shed some light on this.
> 
> Looking into it more...

The problem with this test case is that a statement with UID 0 is being
compared. The assumption was that every stmt has a UID and that UIDs are
monotonically non-decreasing in statement order. This is broken here because
force_gimple_operand_gsi generates new stmts that don't have a UID. The
proposed patch generates UIDs for these newly generated statements, but I think
this is a bit ugly and fragile now.
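The broken ordering assumption can be shown with a toy model. This is a sketch
only, not the reassoc code or the attached patch: Stmt, make_block,
uid_says_before and assign_missing_uids are hypothetical names, and the UIDs
17, 18 and 19 are made up for illustration (only 16 and 0 come from the
analysis above). The statements follow the quoted dump with the bracketing
lines omitted.

```cpp
#include <cstddef>
#include <vector>

struct Stmt {
  const char* text;
  unsigned uid;  // 0 means "no UID assigned yet"
};

// Block order as in the quoted dump; the two stmts created later by
// optimize_range_tests still carry UID 0.
std::vector<Stmt> make_block() {
  return {
      {"e.1_19 = e.1_16 & _18", 16},
      {"_20 = f_12 > 0", 17},
      {"_23 = f_12 > 0", 0},
      {"_24 = (int) _23", 0},
      {"_21 = (int) _20", 18},
      {"e.1_22 = e.1_19 & _21", 19},
  };
}

// Ordering test that trusts UIDs. It is only valid while UIDs are
// non-decreasing in block order.
bool uid_says_before(const Stmt& a, const Stmt& b) { return a.uid < b.uid; }

// One possible repair: walk backwards and give every UID-less stmt the UID
// of its successor, which restores a non-decreasing sequence.
void assign_missing_uids(std::vector<Stmt>& bb) {
  for (std::size_t i = bb.size(); i-- > 0;) {
    if (bb[i].uid == 0 && i + 1 < bb.size()) bb[i].uid = bb[i + 1].uid;
  }
}
```

Before the repair, the UID test claims that _24 = (int) _23 precedes
e.1_19 = e.1_16 & _18, although it comes later in the block; after the repair
the claim no longer holds. Equal UIDs would still need a positional scan to
break ties.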


[Bug middle-end/57393] [4.9 Regression] error: definition in block 4 follows the use / internal compiler error: verify_ssa failed

2013-08-22 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57393

--- Comment #21 from Easwaran Raman  ---
Created attachment 30690
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30690&action=edit
Proposed patch


[Bug middle-end/57393] [4.9 Regression] error: definition in block 4 follows the use / internal compiler error: verify_ssa failed

2013-08-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57393

--- Comment #18 from Easwaran Raman  ---
Could you confirm whether the patch in
http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01584.html fixes this? I am
waiting for someone to review that patch.


[Bug rtl-optimization/57878] Incorrect code: live register clobbered in split2

2013-07-12 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57878

--- Comment #2 from Easwaran Raman  ---
After IRA, we have:

(insn 116 115 117 6 (set (reg:DI 130 [ D.3288 ])
(mem:DI (plus:SI (reg/v/f:SI 172 [orig:109 __first ] [109])
(const_int 4 [0x4])) [10 MEM[base: _1, index: _44, offset: 0]+0
S8 A64])) r.ii:197 88 {*movdi_internal}
 (expr_list:REG_EQUIV (mem:DI (plus:SI (reg/v/f:SI 172 [orig:109 __first ]
[109])
(const_int 4 [0x4])) [10 MEM[base: _1, index: _44, offset: 0]+0
S8 A64])
(nil)))
(insn 117 116 118 6 (parallel [
(set (reg/f:SI 138 [ D.3281 ])
(minus:SI (reg/v/f:SI 173 [orig:110 __cur ] [110])
(reg/v/f:SI 103 [ __cur ])))
(clobber (reg:CC 17 flags))
]) 309 {*subsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 118 117 119 6 (parallel [
(set (reg/f:SI 140 [ D.3282 ])
(plus:SI (reg/v/f:SI 103 [ __cur ])
(const_int 4 [0x4])))
(clobber (reg:CC 17 flags))
]) 273 {*addsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 119 118 120 6 (set (mem:DI (plus:SI (reg/v/f:SI 173 [orig:110 __cur ]
[110])
(const_int 4 [0x4])) [10 MEM[base: _75, index: _77, offset:
0B]+0 S8 A64])
(reg:DI 130 [ D.3288 ])) r.ii:197 88 {*movdi_internal}
 (expr_list:REG_DEAD (reg:DI 130 [ D.3288 ])
(nil)))
(insn 120 119 121 6 (set (reg:DI 131 [ D.3287 ])
(mem:DI (plus:SI (plus:SI (reg/f:SI 99 [ D.3281 ])
(reg/f:SI 126 [ D.3282 ]))
(const_int 8 [0x8])) [10 MEM[base: _1, index: _44, offset: 8]+0
S8 A64])) r.ii:197 88 {*movdi_internal}
 (expr_list:REG_EQUIV (mem:DI (plus:SI (plus:SI (reg/f:SI 99 [ D.3281 ])
(reg/f:SI 126 [ D.3282 ]))
(const_int 8 [0x8])) [10 MEM[base: _1, index: _44, offset: 8]+0
S8 A64])
(nil)))
(insn 121 120 122 6 (set (mem:DI (plus:SI (plus:SI (reg/f:SI 138 [ D.3281 ])
(reg/f:SI 140 [ D.3282 ]))
(const_int 8 [0x8])) [10 MEM[base: _75, index: _77, offset:
8B]+0 S8 A64])
(reg:DI 131 [ D.3287 ])) r.ii:197 88 {*movdi_internal}
 (expr_list:REG_DEAD (reg:DI 131 [ D.3287 ])
(nil)))


After reload,
 1. insn 116 is deleted.
 2. In insn 117, the pseudo 138 is replaced with dx.
 3. dx is spilled to the stack at offset -0x24 from bp.
 4. For insn 119, first a new pseudo 193 is created which is equivalent to 130
and is loaded from memory. This 193 is in DI mode and is replaced by ax. This
would clobber edx, but that is ok since dx is now stored on the stack at offset
-0x24.
 5. This is followed by the store of 193 (ax) into the memory location.
 6. Then dx is loaded from bp-0x24.
 7. insn 120 is deleted.
 8. For insn 121, the pseudo 131 is replaced by 196, which is assigned the hard
reg ax.

In the sequence above, shouldn't the fill of dx from bp-0x24 in step 6 happen
after step 8?


[Bug rtl-optimization/57878] Incorrect code: live register clobbered in split2

2013-07-10 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57878

--- Comment #1 from Easwaran Raman  ---
Created attachment 30494
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30494&action=edit
Disassembly of the compiled r.ii


[Bug rtl-optimization/57878] New: Incorrect code: live register clobbered in split2

2013-07-10 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57878

Bug ID: 57878
   Summary: Incorrect code: live register clobbered in split2
   Product: gcc
   Version: 4.8.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eraman at google dot com

Created attachment 30493
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30493&action=edit
Preprocessed source of the test case

$ ./g++_4_8 --version
g++_4_8 (GCC) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.

(Compiling the attached preprocessed source)
$ ./g++_4_8 -m32 -O2 -fno-omit-frame-pointer -fPIC -std=gnu++11 -c r.ii
$ objdump -d --no-show-raw-insn r.o  > r.s

In the attached r.s, in function
_ZNSt6vectorIN3FDB9ChunkDataESaIS1_EE19_M_emplace_back_auxIIRKS1_EEEvDpOT_, the
sub instruction at address 0xfd writes to edx, which is subsequently stored at
[rbp-0x24]. edx is immediately clobbered. At 0x119, rbp-0x24 is clobbered and
hence the load at 0x168 loads an incorrect value into ecx. 

After reload, we see the following:

(insn 117 241 196 7 (parallel [
(set (reg/f:SI 1 dx [orig:138 D.3282 ] [138])
(minus:SI (reg/f:SI 1 dx [orig:138 D.3282 ] [138])
(reg/v/f:SI 5 di [orig:103 __cur ] [103])))
(clobber (reg:CC 17 flags))
]) 309 {*subsi_1}
 (nil))
(insn 196 117 268 7 (set (mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -36 [0xffdc])) [21 %sfp+-36 S4 A32])
(reg/f:SI 1 dx [orig:138 D.3282 ] [138])) 89 {*movsi_internal}
 (expr_list:REG_DEAD (reg/f:SI 1 dx [orig:138 D.3282 ] [138])
(nil)))
...
(insn 119 274 237 7 (set (reg:DI 0 ax [193])
(mem:DI (plus:SI (reg/v/f:SI 0 ax [orig:109 __first ] [109])
(const_int 4 [0x4])) [10 MEM[base: _1, index: _28, offset: 0]+0
S8 A64])) r.ii:197 88 {*movdi_internal}
 (expr_list:REG_DEAD (reg/v/f:SI 0 ax [orig:109 __first ] [109])
(nil)))
...
(insn 234 120 200 7 (set (reg/f:SI 1 dx [orig:138 D.3282 ] [138])
(mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -36 [0xffdc])) [21 %sfp+-36 S4 A32]))
r.ii:197 89 {*movsi_internal}
 (expr_list:REG_DEAD (reg/f:SI 138 [ D.3282 ])
(nil)))
...
(insn 232 122 265 7 (set (mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -36 [0xffdc])) [21 %sfp+-36 S4 A32])
(reg/f:SI 1 dx [orig:138 D.3282 ] [138])) r.ii:197 89 {*movsi_internal}
 (nil))

The fill in 234 and the second spill in 232 are redundant. Insn 234 gets
removed by ce3 later, but 232 remains till the end. Meanwhile, 119 writes to a
DI mode register and gets split by split2 into eax and edx, and hence the store
in 232 ends up clobbering the right value.


When compiled with a trunk built 2 weeks ago, the compiler ICEs with the
following stack trace:

r.ii: In member function ‘void std::vector<_Tp,
_Alloc>::_M_emplace_back_aux(_Args&& ...) [with _Args = {const
FDB::ChunkData&}; _Tp = FDB::ChunkData; _Alloc =
std::allocator<FDB::ChunkData>]’:
r.ii:192:3: internal compiler error: in assign_by_spills, at lra-assigns.c:1266
   }
   ^
0x9ba360 assign_by_spills
/usr/local/google/home/eraman/gcc/trunk/gcc/lra-assigns.c:1266
0x9bb013 lra_assign()
/usr/local/google/home/eraman/gcc/trunk/gcc/lra-assigns.c:1423
0x9b71b9 lra(_IO_FILE*)
/usr/local/google/home/eraman/gcc/trunk/gcc/lra.c:2342
0x97d7b8 do_reload
/usr/local/google/home/eraman/gcc/trunk/gcc/ira.c:4689
0x97d7b8 rest_of_handle_reload
/usr/local/google/home/eraman/gcc/trunk/gcc/ira.c:4801

[Bug target/57088] Register allocator has an issue with subreg in some cases

2013-05-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57088

Easwaran Raman  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Easwaran Raman  ---
This is a dup of PR rtl-optimization/57046. Verified that r198263 fixes this
as well.

*** This bug has been marked as a duplicate of bug 57046 ***


[Bug rtl-optimization/57046] [4.8 Regression] wrong code generated by gcc 4.8.0 on i686

2013-05-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57046

Easwaran Raman  changed:

   What|Removed |Added

 CC||eraman at google dot com

--- Comment #8 from Easwaran Raman  ---
*** Bug 57088 has been marked as a duplicate of this bug. ***


[Bug tree-optimization/57337] 416.gamess ICE on x86 after r199048

2013-05-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57337

--- Comment #1 from Easwaran Raman  ---
Could you please attach the preprocessed file? Thanks.


[Bug target/57088] New: Post-reload instruction splitting clobbers live register

2013-04-26 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57088

 Bug #: 57088
   Summary: Post-reload instruction splitting clobbers live register
Classification: Unclassified
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: era...@google.com

Created attachment 29952
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29952
Test case

$ ./trunk_gcc --version
trunk_gcc (GCC) 4.9.0 20130423 (experimental)
$ ./trunk_gcc -fno-exceptions -m32 -O2 -fpermissive -fPIC -S reduced.ii

In the source, the LargeObjectCacheInterface::Update method has the following
statement:

IOBuffer *save_buf = new VIOBuffer(buffer->Length());

This gets translated to the following assembly fragment:

(...)
        call    _Znwj@PLT
        movl    -44(%ebp), %ecx
        movl    -48(%ebp), %edx
        movl    $0, 12(%esp)
        movl    $16, 8(%esp)
        movl    %ecx, (%esp)
        movl    %edx, 4(%esp)
        movl    %ecx, -52(%ebp)
        call    _ZN8IOBufferC2Eiii@PLT
(...)

Note that the output of operator new is not passed to IOBuffer constructor. 

Here is the RTL from reduced.ii.210r.postreload:

(call_insn 21 20 22 2 (set (reg:SI 0 ax)
(call (mem:QI (symbol_ref:SI ("_Znwj") [flags 0x41]  ) [0 operator new S1 A8])
(const_int 4 [0x4]))) reduced.ii:50 652 {*call_value}
 (nil)
(expr_list:REG_DEP_TRUE (use (reg:SI 3 bx))
(expr_list:REG_BR_PRED (use (mem:SI (reg/f:SI 7 sp) [0 S4 A32]))
(nil
(insn 22 21 23 2 (set (reg/v/f:SI 2 cx [orig:59 save_buf ] [59])
(reg:SI 0 ax)) reduced.ii:50 85 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 0 ax)
(nil)))
(insn 23 22 24 2 (set (mem:SI (plus:SI (reg/f:SI 7 sp)
(const_int 12 [0xc])) [0 S4 A32])
(const_int 0 [0])) reduced.ii:41 85 {*movsi_internal}
 (nil))
(insn 24 23 59 2 (set (mem:SI (plus:SI (reg/f:SI 7 sp)
(const_int 8 [0x8])) [0 S4 A32])
(const_int 16 [0x10])) reduced.ii:41 85 {*movsi_internal}
 (nil))
(insn 59 24 25 2 (set (reg:DI 1 dx [orig:78 n ] [78])
(mem/c:DI (plus:SI (reg/f:SI 6 bp)
(const_int -48 [0xffd0])) [9 %sfp+-48 S8 A64]))
reduced.ii:41 84 {*movdi_internal}
 (expr_list:REG_DEAD (reg:DI 92)
(nil)))
(insn 25 59 26 2 (set (mem:SI (plus:SI (reg/f:SI 7 sp)
(const_int 4 [0x4])) [0 S4 A32])
(reg:SI 1 dx [orig:78 n ] [78])) reduced.ii:41 85 {*movsi_internal}
 (expr_list:REG_DEAD (reg:DI 1 dx [orig:78 n ] [78])
(nil)))
(insn 26 25 56 2 (set (mem:SI (reg/f:SI 7 sp) [0 S4 A32])
(reg/v/f:SI 2 cx [orig:59 save_buf ] [59])) reduced.ii:41 85
{*movsi_internal}
 (nil))
(insn 56 26 27 2 (set (mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -52 [0xffcc])) [9 %sfp+-52 S4 A32])
(reg/v/f:SI 2 cx [orig:59 save_buf ] [59])) reduced.ii:41 85
{*movsi_internal}
 (expr_list:REG_DEAD (reg/v/f:SI 2 cx [orig:59 save_buf ] [59])
(nil)))
(call_insn 27 56 28 2 (call (mem:QI (symbol_ref:SI ("_ZN8IOBufferC2Eiii")
[flags 0x41]  ) [0 __base_ctor  S1
A8])


The dump from the next pass (split2) shows that insn 59 above is split into:

(insn 69 24 70 2 (set (reg:SI 1 dx [orig:78 n ] [78])
(mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -48 [0xffd0])) [9 %sfp+-48 S4 A64]))
reduced.ii:41 85 {*movsi_internal}
 (nil))
(insn 70 69 25 2 (set (reg:SI 2 cx [ n+4 ])
(mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -44 [0xffd4])) [9 %sfp+-44 S4 A32]))
reduced.ii:41 85 {*movsi_internal}
 (nil))


But here, cx is actually live and the split_after_reload pass clobbers it.
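The clobber can be replayed with a toy register file. This is only a sketch:
HardReg, RegFile, load_di and save_buf_survives are hypothetical names, and the
concrete values are made up. The only facts taken from the dumps are the hard
register numbers (ax 0, dx 1, cx 2, bx 3) and that the DImode value placed in
dx by insn 59 is split into dx and cx.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hard register numbers as they appear in the RTL dumps.
enum HardReg { AX = 0, DX = 1, CX = 2, BX = 3 };
using RegFile = std::array<std::uint32_t, 4>;

// A 64-bit (DImode) value placed in register r occupies r and r + 1.
void load_di(RegFile& regs, HardReg r, std::uint64_t value) {
  assert(r < BX);  // the pair must fit in this toy register file
  regs[r] = static_cast<std::uint32_t>(value);
  regs[r + 1] = static_cast<std::uint32_t>(value >> 32);
}

// Replays insns 22, 59 and 26 and reports whether cx still holds save_buf.
bool save_buf_survives() {
  RegFile regs{};
  const std::uint32_t save_buf = 0x1000;     // pretend result of operator new
  regs[CX] = save_buf;                       // insn 22: cx = ax
  load_di(regs, DX, 0x0000000500000010ULL);  // insn 59: DImode load into dx
  return regs[CX] == save_buf;               // insn 26 reads cx here
}
```

In this model save_buf_survives() returns false: the high half of the DImode
load lands in cx while save_buf is still needed by insns 26 and 56.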


[Bug middle-end/56988] New: ipa-cp incorrectly propagates a field of an aggregate

2013-04-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56988

 Bug #: 56988
   Summary: ipa-cp incorrectly propagates a field of an aggregate
Classification: Unclassified
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: era...@google.com

Created attachment 29890
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29890
Reduced test case

$ trunk_g++ --version
trunk_g++ (GCC) 4.9.0 20130416 (experimental)


$ trunk_g++ -S -O2  -std=c++11 -fno-exceptions upstream_test_case.ii && grep "mov.* _ZTVN12_GLOBAL__N_18RCTesterE" upstream_test_case.s
        movq    %rax, _ZTVN12_GLOBAL__N_18RCTesterE+24(%rip)

The generated assembly attempts to write into RCTester class's vtable.

From the dump generated by -fdump-ipa-whole-program-all (just before ipa-cp),
the caller has the following code:

  # .MEM_11 = VDEF <.MEM_10>
  obj_3->D.2045._vptr.ReferenceCountedD.2013 = &MEM[(voidD.45
*)&_ZTVN12_GLOBAL__N_18RCTesterED.2049 + 16B];
  # .MEM_12 = VDEF <.MEM_11>
  obj_3->destructed_D.2025 = 0B;
  # .MEM_13 = VDEF <.MEM_12>
  obj_3->owner_D.2026 = 0B;
  # .MEM_5 = VDEF <.MEM_13>
  # USE = nonlocal null { D.2015 D.2049 } (glob)
  # CLB = nonlocal null { D.2015 D.2049 } (glob)
  _ZN12_GLOBAL__N_19TestResetEPNS_8RCTesterED.2068 (obj_3);


At the callee, we see:

void {anonymous}::TestReset({anonymous}::RCTester*) (struct RCTesterD.2017 *
objD.2067)
{
  const struct AssertionResultD.1962 gtest_arD.2071;
  boolD.1899 destructedD.2070;
  struct RCTesterD.2017 * obj.3D.2179;

  # .MEM_2 = VDEF <.MEM_1(D)>
  destructedD.2070 = 0;
  # VUSE <.MEM_2>
  # PT = nonlocal escaped 
  obj.3_3 = objD.2067;
  # .MEM_8 = VDEF <.MEM_2>
  MEM[(boolD.1899 * *)obj.3_3 + 8B] = &destructedD.2070;

ipa-cp mistakenly thinks that the move statement
 obj.3_3 = objD.2067;

actually loads from offset 0 of objD.2067 and hence propagates &MEM[(voidD.45
*)&_ZTVN12_GLOBAL__N_18RCTesterED.2049 + 16B] into obj.3_3 which then
subsequently gets propagated to the store of &destructedD.2070. 

The following patch fixes this, but not sure if this could be too restrictive:
Index: gcc/ipa-prop.c
===
--- gcc/ipa-prop.c(revision 197495)
+++ gcc/ipa-prop.c(working copy)
@@ -3892,7 +3892,7 @@ ipcp_transform_function (struct cgraph_node *node)
   {
 struct ipa_agg_replacement_value *v;
 gimple stmt = gsi_stmt (gsi);
-tree rhs, val, t;
+tree rhs, lhs, val, t;
 HOST_WIDE_INT offset;
 int index;
 bool by_ref, vce;
@@ -3900,6 +3900,7 @@ ipcp_transform_function (struct cgraph_node *node)
 if (!gimple_assign_load_p (stmt))
   continue;
 rhs = gimple_assign_rhs1 (stmt);
+lhs = gimple_assign_lhs (stmt);
 if (!is_gimple_reg_type (TREE_TYPE (rhs)))
   continue;

@@ -3924,7 +3925,8 @@ ipcp_transform_function (struct cgraph_node *node)
   continue;
 for (v = aggval; v; v = v->next)
   if (v->index == index
-  && v->offset == offset)
+  && v->offset == offset
+  && TREE_TYPE (v->value) == TREE_TYPE (lhs))
 break;
 if (!v)
   continue;
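The distinction that ipa-cp gets wrong in this report, copying the pointer
parameter versus loading the field at offset 0 of the object it points to, can
be written out at the source level. This is only a sketch: the RCTester struct
below is a stand-in with a made-up layout, not the class from the attached test
case, and copy_pointer and load_offset0 are hypothetical names.

```cpp
// Stand-in layout: a vtable-pointer-like field first, then a pointer field.
struct RCTester {
  const void* vptr;  // plays the role of the field at offset 0
  bool* destructed;  // plays the role of the field written via obj + 8B
};

// Corresponds to "obj.3_3 = obj": the pointer value itself is copied and no
// memory belonging to *obj is read.
RCTester* copy_pointer(RCTester* obj) { return obj; }

// Corresponds to an actual load from offset 0 of *obj. Only here would a
// known aggregate value for offset 0 be the right replacement.
const void* load_offset0(RCTester* obj) { return obj->vptr; }
```

The two functions return different values for the same argument, so replacing
the result of the first with the known contents at offset 0 redirects the later
store through the vtable address.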


[Bug middle-end/54957] Two crashes introduced by rev192488

2012-10-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

--- Comment #11 from Easwaran Raman  2012-10-17 20:31:21 UTC ---
(In reply to comment #10)
> Created attachment 28467 [details]
> emit_case_dispatch_table testcase
> 
> Here's a csmith generated testcase that crashes with -O0 -fexceptions on
> sh4-unknown-linux-gnu. It's slightly reduced but I can reduce it further by
> hand if necessary.

Does my second patch fix this as well or is it still there?

- Easwaran


[Bug middle-end/54957] Two crashes introduced by rev192488

2012-10-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

Easwaran Raman  changed:

   What|Removed |Added

  Attachment #28465|0   |1
is obsolete||

--- Comment #8 from Easwaran Raman  2012-10-17 18:56:14 UTC ---
Created attachment 28466
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28466
Proposed patch

Handle the possibility that stmt_bb may be NULL in emit_case_dispatch_table.
Untested.


[Bug middle-end/54957] Two crashes introduced by rev192488

2012-10-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

--- Comment #6 from Easwaran Raman  2012-10-17 18:26:30 UTC ---
(In reply to comment #5)
> Created attachment 28465 [details]
> Proposed patch

I haven't tested the patch. Ryan, could you please confirm this patch fixes the
crashes?

Thanks,
Easwaran


[Bug middle-end/54957] Two crashes introduced by rev192488

2012-10-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

--- Comment #5 from Easwaran Raman  2012-10-17 18:24:48 UTC ---
Created attachment 28465
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28465
Proposed patch


[Bug middle-end/54957] Two crashes introduced by rev192488

2012-10-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

--- Comment #3 from Easwaran Raman  2012-10-17 18:08:24 UTC ---
(In reply to comment #0)
> http://gcc.gnu.org/viewcvs?view=revision&revision=192488
> 
> 
> sh4-unknown-linux-gnu no longer builds libgcc.
> 
> 0x7df7df emit_cmp_and_jump_insn_1
> ../../gcc/optabs.c:4273
> 0x7df7df emit_cmp_and_jump_insns(rtx_def*, rtx_def*, rtx_code, rtx_def*,
> machine_mode, int, rtx_def*, int)
> ../../gcc/optabs.c:4324
> 0x6136f6 do_compare_rtx_and_jump(rtx_def*, rtx_def*, rtx_code, int,
> machine_mode, rtx_def*, rtx_def*, rtx_def*, int)
> ../../gcc/dojump.c:1072
> 0x61479b do_compare_and_jump
> ../../gcc/dojump.c:1154
> 0x6164c1 do_jump_1(tree_code, tree_node*, tree_node*, rtx_def*, rtx_def*, int)
> ../../gcc/dojump.c:206
> 0x5ba1de expand_gimple_cond
> ../../gcc/cfgexpand.c:1852
> 0x5c1b9b expand_gimple_basic_block
> ../../gcc/cfgexpand.c:3832
> 0x5c2ec5 gimple_expand_cfg
> ../../gcc/cfgexpand.c:4477
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> 

The first one seems a dup of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54938.
The obvious fix is to remove the assert. Will send out a patch.


[Bug target/54938] sh libgcc_unpack_df.o fails to build: ../../../srcw/libgcc/fp-bit.h:221:19: internal compiler error: in emit_cmp_and_jump_insn_1, at optabs.c:4273

2012-10-16 Thread eraman at google dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54938

--- Comment #4 from Easwaran Raman  2012-10-16 17:04:05 UTC ---
(In reply to comment #3)
> Thanks Jörn.
> The problem is not related to my changes in PR 51244.  It is caused by the
> latest change to optabs.c:
> 
> 2012-10-15   Easwaran Raman  
> * optabs.c (emit_cmp_and_jump_insn_1): Add a new parameter to
> specify the probability of taking the jump.
> (emit_cmp_and_jump_insns): Likewise.
> 
> 
> In emit_cmp_and_jump_insn_1, the line
> 
>   gcc_assert (!find_reg_note (insn, REG_BR_PROB, 0));
> 
> blows up, because of config/sh/sh.c (expand_cbranchsi4):
> 
>   rtx jump = emit_jump_insn (branch_expander (operands[3]));
>   if (probability >= 0)
> add_reg_note (jump, REG_BR_PROB, GEN_INT (probability));

I am confused why this code causes the assert in emit_cmp_and_jump_insn_1.
Could you please attach a stack trace? 

> 
> The following seems to fix the problem
> 
> Index: gcc/optabs.c
> ===
> --- gcc/optabs.c(revision 192494)
> +++ gcc/optabs.c(working copy)
> @@ -4270,8 +4270,8 @@
>&& JUMP_P (insn)
>&& any_condjump_p (insn))
>  {
> -  gcc_assert (!find_reg_note (insn, REG_BR_PROB, 0));
> -  add_reg_note (insn, REG_BR_PROB, GEN_INT (prob));
> +  if (!find_reg_note (insn, REG_BR_PROB, 0))
> +add_reg_note (insn, REG_BR_PROB, GEN_INT (prob));
>  }
>  }
> 
> 
> Easwaran, could you please have a look at that?  Does the change above make
> sense?

While this would certainly make the error go away, it would be good to
understand the root cause. If there is already a REG_BR_PROB note but its
probability differs from the one passed to emit_cmp_and_jump_insn_1,
should the existing value be replaced or left as is?
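
To make the two options concrete, here is a minimal self-contained model of them. It is not GCC code: `struct note`, `find_note` and `set_prob_note` are hypothetical stand-ins for the reg-note list, find_reg_note and add_reg_note, and the probability values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reg note attached to an insn.  */
enum note_kind { NOTE_BR_PROB, NOTE_OTHER };

struct note {
  enum note_kind kind;
  int value;
  struct note *next;
};

enum prob_policy { KEEP_EXISTING, REPLACE_EXISTING };

/* Return the first note of the given kind, or NULL if there is none.  */
static struct note *
find_note (struct note *list, enum note_kind kind)
{
  for (; list; list = list->next)
    if (list->kind == kind)
      return list;
  return NULL;
}

/* Record PROB on the note list headed by *HEAD, using FRESH as storage
   if a new note is needed.  When a probability note already exists,
   POLICY decides whether the old value survives.  Returns the value
   that is in effect afterwards.  */
static int
set_prob_note (struct note **head, struct note *fresh, int prob,
               enum prob_policy policy)
{
  assert (head != NULL && fresh != NULL);
  struct note *old = find_note (*head, NOTE_BR_PROB);
  if (old == NULL)
    {
      fresh->kind = NOTE_BR_PROB;
      fresh->value = prob;
      fresh->next = *head;
      *head = fresh;
      return prob;
    }
  if (policy == REPLACE_EXISTING)
    old->value = prob;
  return old->value;
}
```

The patch quoted above corresponds to KEEP_EXISTING: the value set by the target's expander wins and the caller's probability is silently dropped.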

Thanks,
Easwaran


[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing

2011-07-14 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #13 from Easwaran Raman 2011-07-14 22:10:16 UTC ---
I looked at the dumps for 920501-7.c, and the second invocation of DSE removes
a necessary store. The relevant dump for function x from
920501-7.c.198r.pro_and_epilogue is below:

(insn 2 58 53 2 (set (mem/c:SI (plus:SI (reg/f:SI 11 fp)
(const_int -56 [0xffc8])) [6 %sfp+-20 S4 A32])
(reg:SI 0 r0 [ a ]))
/scratch/janisjo/arm-linux-fsf/src/gcc-mainline/gcc/testsuite/gcc.reghunt/920501-7.c:12
176 {*arm_movsi_insn}
 (nil))
...

(call_insn/c/i 11 9 12 2 (parallel [
(call (mem:SI (symbol_ref:SI ("y.1271") [flags 0x3]  ) [0 y S4 A32])
(const_int 0 [0]))
(use (const_int 0 [0]))
(clobber (reg:SI 14 lr))
])
/scratch/janisjo/arm-linux-fsf/src/gcc-mainline/gcc/testsuite/gcc.reghunt/920501-7.c:20
242 {*call_symbol}
 (expr_list:REG_NORETURN (const_int 0 [0])
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil)))
(expr_list:REG_DEP_TRUE (use (reg:SI 0 r0))
(expr_list:REG_DEP_TRUE (use (reg:SI 12 ip))
(nil

...
(insn 24 18 30 3 (set (reg/i:SI 0 r0)
(mem/c:SI (plus:SI (reg/f:SI 11 fp)
(const_int -20 [0xffec])) [6 %sfp+-20 S4 A32]))
/scratch/janisjo/arm-linux-fsf/src/gcc-mainline/gcc/testsuite/gcc.reghunt/920501-7.c:23
176 {*arm_movsi_insn}
 (nil))



Insns 2 and 24 refer to the same location, but have different offsets
relative to FP because the call to y changes FP. DSE doesn't (and cannot, if
it is intra-procedural) know that they both refer to the same location and
hence thinks insn 2 is dead.

It seems to me that this (FP having a different value after the call) can
only happen at postreload, and that setting wild_read (not just
non_frame_wild_read) on all calls after postreload will fix this problem.
What's the best way to do that? Will checking for clear_alias_sets != NULL
work?
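
The mismatch can be shown with plain arithmetic. The sketch below is only an illustration: the offsets -56 and -20 come from the dump above, while the FP values and the helper names are made up, chosen so that FP differs by 36 between the two insns as the explanation above assumes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A frame reference as an intra-procedural pass sees it: the value of
   the base register is unknown, only the constant offset is visible.  */
struct frame_ref { int64_t offset; };

/* What a pass that compares only offsets concludes.  */
static bool
same_slot_by_offset (struct frame_ref a, struct frame_ref b)
{
  return a.offset == b.offset;
}

/* What is actually true once the FP value at each insn is known.  */
static bool
same_slot_by_address (int64_t fp_a, struct frame_ref a,
                      int64_t fp_b, struct frame_ref b)
{
  return fp_a + a.offset == fp_b + b.offset;
}

/* Insn 2 stores at FP-56, insn 24 loads from FP-20.  The FP values
   1000 and 964 are arbitrary; only their difference matters.  */
static void
demo (void)
{
  struct frame_ref store = { -56 }, load = { -20 };
  assert (!same_slot_by_offset (store, load));
  assert (same_slot_by_address (1000, store, 964, load));
}
```

Because the offset-only comparison says the load does not read the stored slot, the store looks dead even though both insns touch the same address.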


[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing

2011-07-14 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #12 from Easwaran Raman 2011-07-14 17:16:06 UTC ---
(In reply to comment #11)
> I have confirmed that the -Os failures began with r175063 and that the tests
> pass for several revisions before that and pass for several after, so it's
> unlikely to be an intermittent failure.  If it would help I can send dump
> files for r175063 and the one just before that.

It is possible that the second DSE invocation deletes a necessary store. My
understanding was that it only acts on spilled stores and all my changes are in
the _nospill version, but that seems not to be the case. Could you send me all
the RTL dumps with and without this patch as a tar file? That will be very
useful in narrowing it down.

Thanks,
Easwaran


[Bug tree-optimization/49452] comp-goto-2.c regresses in testing

2011-06-24 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #10 from Easwaran Raman 2011-06-24 23:07:58 UTC ---
(In reply to comment #9)
> I still get the -Os failures (I never had the others) with r175389 and have
> attached the requested rtl dumps.

This doesn't look like a DSE related bug to me. The RTL dump shows that no
store has been deleted by DSE in any of the functions.


[Bug tree-optimization/49452] comp-goto-2.c regresses in testing

2011-06-24 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #6 from Easwaran Raman 2011-06-24 22:19:40 UTC ---
Could you please test if r175384 fixes these failures? Otherwise please run one
of the smaller tests with -fdump-rtl-dse1-all and -fdump-rtl-cse2 (the pass
before DSE) and upload those dumps and I will take a look.


[Bug rtl-optimization/49429] [4.7 Regression] dse.c changes to fix PR44194 (r175063) cause execution failures

2011-06-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429

--- Comment #11 from Easwaran Raman 2011-06-20 18:40:46 UTC ---
(In reply to comment #10)
> With regards to the question in comment #9, you would probably do better
> asking it on the gcc-patches mailing list than in the comment of this bug
> report since more people would see it on the mailing list.

Could you please try out this patch? I don't have the ia-64 host libraries and
have a half-broken cross compiler, but this seems to fix the issue in y.c. 


Index: gcc/expr.c
===
--- gcc/expr.c(revision 175081)
+++ gcc/expr.c(working copy)
@@ -1181,8 +1181,19 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enu
   else if (may_use_call
&& ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x))
&& ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y)))
-retval = emit_block_move_via_libcall (x, y, size,
-  method == BLOCK_OP_TAILCALL);
+{
+  /* Since x and y are passed to a libcall, mark the corresponding
+ tree EXPR as addressable.  */
+  tree y_expr = MEM_EXPR (y);
+  tree x_expr = MEM_EXPR (x);
+  if (y_expr)
+mark_addressable (y_expr);
+  if (x_expr)
+mark_addressable (x_expr);
+  retval = emit_block_move_via_libcall (x, y, size,
+method == BLOCK_OP_TAILCALL);
+}
+
   else
 emit_block_move_via_loop (x, y, size, align);


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-06-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

--- Comment #35 from Easwaran Raman 2011-06-20 16:51:18 UTC ---
(In reply to comment #33)
> > I think these two are totally independent of each other -- one should not be
> > gated against each other. If Easwaran's approach is completely flawed, that
> > is different story.  With this change, we at least make incremental
> > improvement.  Not familiar with the rtl expander, but I guess the spilling
> > was there probably for a deeper reason. If you have an insight, you can of
> > course point it out.
> 
> See comment #22.  It's an incremental improvement, but maybe we can avoid
> wasting time and memory by creating RTXes and Trees that will be thrown away
> immediately after.  I don't really see what we risk by trying.

There is a comment in calls.c that says
   /* Handle calls that return values in multiple non-contiguous locations.
 The Irix 6 ABI has examples of this.  */

I don't know if avoiding the copy breaks that ABI in any way so I didn't try
that approach. In general, if the TARGET is non-NULL, I don't see how the copy
can be avoided (but then, the tree EXPR corresponding to the target hopefully
has the addressable flag set). In this particular case though TARGET is NULL.
Is it just a matter of setting VALREG and letting expand_assignment deal with it?

Irrespective of how this case is handled, I think there may be other instances
where a store generated during expansion may be redundant, but we don't know it
at the point of generation. In such cases, is this approach of associating a
tree expr with the temp rtx generated by the expander reasonable?


[Bug rtl-optimization/49429] [4.7 Regression] dse.c changes to fix PR44194 (r175063) cause execution failures

2011-06-17 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429

--- Comment #9 from Easwaran Raman 2011-06-17 20:43:04 UTC ---
(In reply to comment #8)
> Compiling x.c with a ia64-unknown-linux cross compiler, setting a breakpoint
> in can_escape(), I see that,
> 
> 
> (gdb) p debug_rtx (body)
> (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64])
> (reg:DI 112 in0))
> 
> This is part of an instruction that gets removed:
> (insn 4 3 6 2 (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64])
> (reg:DI 112 in0)) y.c:23 5 {movdi_internal}
>  (expr_list:REG_DEAD (reg:DI 112 in0)
> (nil)))
> 
> (gdb) p expr->base.code 
> $24 = PARM_DECL
> (gdb) p may_be_aliased (expr)
> $23 = 0 '\000'
> 
> So can_escape() returns false. But later on, in the same BB, I see:
> 
> 
> (insn 36 30 37 2 (set (reg:DI 120 out0)
> (reg/f:DI 357)) 5 {movdi_internal}
>  (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp)
> (const_int 62 [0x3e]))
> (nil)))
> (insn 37 36 38 2 (set (reg:DI 121 out1)
> (reg/f:DI 341)) 5 {movdi_internal}
>  (expr_list:REG_DEAD (reg/f:DI 341)
> (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp)
> (const_int 96 [0x60]))
> (nil
> (insn 38 37 39 2 (set (reg:DI 122 out2)
> (const_int 31 [0x1f])) 5 {movdi_internal}
>  (nil))
> (call_insn 39 38 42 2 (parallel [
> (set (reg:DI 8 r8)
> (call (mem:DI (symbol_ref:DI ("memcpy") [flags 0x41] 
> ) [0 memcpy S8 A64])
> (const_int 1 [0x1])))
> (clobber (reg:DI 320 b0))
> (clobber (scratch:DI))
> (clobber (scratch:DI))
> ]) 332 {call_value_gp}
>  (expr_list:REG_DEAD (reg:DI 122 out2)
> (expr_list:REG_DEAD (reg:DI 121 out1)
> (expr_list:REG_DEAD (reg:DI 120 out0)
> (expr_list:REG_UNUSED (reg:DI 8 r8)
> (expr_list:REG_EH_REGION (const_int 0 [0])
> (nil))
> (expr_list:REG_DEP_TRUE (use (reg:DI 1 r1))
> (expr_list:REG_DEP_TRUE (use (reg:DI 122 out2))
> (expr_list:REG_DEP_TRUE (use (reg:DI 121 out1))
> (expr_list:REG_DEP_TRUE (use (reg:DI 120 out0))
> (nil))
> 
> reg 341 is passed as source argument of a memcpy. Why does the expression
> return 0 for may_be_aliased()?

Could someone explain why may_be_aliased returns false in this case? I would
expect TREE_ADDRESSABLE to be true, but that's not the case. It seems to me
some other bug is exposed by the DSE patch.


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-06-16 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

--- Comment #27 from Easwaran Raman 2011-06-16 17:14:38 UTC ---
(In reply to comment #26)
> On Wed, 15 Jun 2011, xinliangli at gmail dot com wrote:
> 
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194
> > 
> > davidxl  changed:
> > 
> >What|Removed |Added
> > 
> >  CC||xinliangli at gmail dot com
> > 
> > --- Comment #23 from davidxl  2011-06-15 
> > 23:14:50 UTC ---
> > (In reply to comment #22)
> > > > The DSE patch still leaves 2 redundant stores.
> > > 
> > > OK, I missed this, reopening...
> > > 
> > > > The following patch will enable DSE to remove those two stores. Does
> > > > this look ok?
> > > 
> > > Calling into the gimplifier from the RTL expander doesn't look
> > > appropriate.
> 
> It also should use create_tmp_var, not create_tmp_reg.  But I wonder why
> memory allocated via assign_temp isn't marked in a way to let dse
> do its job (I guess dse thinks that memory escapes?).
If the mem rtx doesn't have a tree_expression associated with it, DSE assumes
the memory escapes.

> 
> > > More fundamentally, it's a little unfortunate to spill to memory a value
> > > returned in registers.  Can we try to use emit_group_move_into_temps here
> > > instead (under the appropriate circumstances)?
> > 
> > It would be nice if the expander does not spill the return into memory in
> > the first place if possible.  On other hand tagging compiler created memory
> > location with temp decls so that aliaser has the symbolic information seems
> > a useful mechanism.
> 
> Sure - but I wonder why assign_temp doesn't do something equivalent
> that doesn't require a automatic VAR_DECL to be created.
> 
> Where does the aliaser catch things with the VAR_DECL around that
> it doesn't without it?

Is it just that when I create a VAR_DECL, TREE_ADDRESSABLE is false and
may_be_aliased returns false?

> Richard.


[Bug rtl-optimization/49429] [4.7 Regression] dse.c changes to fix PR44194 (r175063) cause execution failures

2011-06-16 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429

--- Comment #8 from Easwaran Raman 2011-06-16 16:27:44 UTC ---
Compiling x.c with a ia64-unknown-linux cross compiler, setting a breakpoint in
can_escape(), I see that,


(gdb) p debug_rtx (body)
(set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64])
(reg:DI 112 in0))

This is part of an instruction that gets removed:
(insn 4 3 6 2 (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64])
(reg:DI 112 in0)) y.c:23 5 {movdi_internal}
 (expr_list:REG_DEAD (reg:DI 112 in0)
(nil)))

(gdb) p expr->base.code 
$24 = PARM_DECL
(gdb) p may_be_aliased (expr)
$23 = 0 '\000'

So can_escape() returns false. But later on, in the same BB, I see:


(insn 36 30 37 2 (set (reg:DI 120 out0)
(reg/f:DI 357)) 5 {movdi_internal}
 (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp)
(const_int 62 [0x3e]))
(nil)))
(insn 37 36 38 2 (set (reg:DI 121 out1)
(reg/f:DI 341)) 5 {movdi_internal}
 (expr_list:REG_DEAD (reg/f:DI 341)
(expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp)
(const_int 96 [0x60]))
(nil
(insn 38 37 39 2 (set (reg:DI 122 out2)
(const_int 31 [0x1f])) 5 {movdi_internal}
 (nil))
(call_insn 39 38 42 2 (parallel [
(set (reg:DI 8 r8)
(call (mem:DI (symbol_ref:DI ("memcpy") [flags 0x41] 
) [0 memcpy S8 A64])
(const_int 1 [0x1])))
(clobber (reg:DI 320 b0))
(clobber (scratch:DI))
(clobber (scratch:DI))
]) 332 {call_value_gp}
 (expr_list:REG_DEAD (reg:DI 122 out2)
(expr_list:REG_DEAD (reg:DI 121 out1)
(expr_list:REG_DEAD (reg:DI 120 out0)
(expr_list:REG_UNUSED (reg:DI 8 r8)
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil))
(expr_list:REG_DEP_TRUE (use (reg:DI 1 r1))
(expr_list:REG_DEP_TRUE (use (reg:DI 122 out2))
(expr_list:REG_DEP_TRUE (use (reg:DI 121 out1))
(expr_list:REG_DEP_TRUE (use (reg:DI 120 out0))
(nil))

reg 341 is passed as source argument of a memcpy. Why does the expression
return 0 for may_be_aliased()?


[Bug rtl-optimization/49429] dse.c changes to fix PR44194 (r175063) cause execution failures

2011-06-15 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429

--- Comment #7 from Easwaran Raman 2011-06-16 00:05:51 UTC ---
From the dump after the dse.c changes, I see the following for the function
test2_31:


starting to process insn 90
  v:  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88
non-frame wild read
starting to process insn 89
  v:  25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82,
83, 84, 85, 86, 87, 88

Insn 90 is a call to check31. This is supposed to kill all locations that
escape from the caller, but from the dump it looks like it has only killed some
of them.


[Bug rtl-optimization/49429] dse.c changes to fix PR44194 (r175063) cause execution failures

2011-06-15 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49429

--- Comment #1 from Easwaran Raman 2011-06-15 22:22:05 UTC ---
Can you please attach the dse1 dump with and without my patch so that I can
look into it? I will also try to build an IA64 cross compiler and see if I can
spot what's happening, but I don't have access to an ia64 machine to run the
tests.


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-06-15 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

--- Comment #21 from Easwaran Raman 2011-06-15 20:34:32 UTC ---
The DSE patch still leaves 2 redundant stores. The following patch will enable
DSE to remove those two stores. Does this look ok?



Index: gcc/testsuite/gcc.dg/pr44194-1.c
===
--- gcc/testsuite/gcc.dg/pr44194-1.c(revision 175082)
+++ gcc/testsuite/gcc.dg/pr44194-1.c(working copy)
@@ -13,5 +13,5 @@ void func() {
   struct ints s = foo();
   bar(s.a, s.b);
 }
-/* { dg-final { scan-rtl-dump "global deletions = 2"  "dse1" } } */
+/* { dg-final { scan-rtl-dump "global deletions = 4"  "dse1" } } */
 /* { dg-final { cleanup-rtl-dump "dse1" } } */
Index: gcc/calls.c
===
--- gcc/calls.c(revision 175081)
+++ gcc/calls.c(working copy)
@@ -3005,8 +3005,9 @@ expand_call (tree exp, rtx target, int ignore)
   tree nt = build_qualified_type (rettype,
   (TYPE_QUALS (rettype)
| TYPE_QUAL_CONST));
-
+  tree target_expr = create_tmp_reg (rettype, NULL);
   target = assign_temp (nt, 0, 1, 1);
+  set_mem_expr (target, target_expr);
 }

   if (! rtx_equal_p (target, valreg))


[Bug rtl-optimization/49414] gcc.dg/pr44194-1.c fails

2011-06-15 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49414

--- Comment #2 from Easwaran Raman 2011-06-15 16:58:57 UTC ---
The DSE opportunity doesn't arise on ia32 since the struct is returned through
the stack. Is the following patch, which restricts the test to x86_64, ok? (I
have tested that it works correctly on x86_64, but don't know how to test that
it gets excluded on other platforms.)


===
--- gcc/testsuite/gcc.dg/pr44194-1.c(revision 175063)
+++ gcc/testsuite/gcc.dg/pr44194-1.c(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target x86_64-*-* } } */
 /* { dg-options "-O2 -fdump-rtl-dse1" } */
 #include 


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-04-20 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

--- Comment #17 from Easwaran Raman 2011-04-21 00:20:51 UTC ---
On Sun, Apr 17, 2011 at 3:45 AM, rguenther at suse dot de
 wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194
>
> --- Comment #16 from rguenther at suse dot de  
> 2011-04-17 10:44:02 UTC ---
> On Fri, 15 Apr 2011, eraman at google dot com wrote:
>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194
>>
>> --- Comment #15 from Easwaran Raman  2011-04-15 
>> 22:22:15 UTC ---
>> (In reply to comment #14)
>> > On Fri, 15 Apr 2011, eraman at google dot com wrote:
>> >
>> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194
>> > >
>> > > Easwaran Raman  changed:
>> > >
>> > >            What    |Removed                     |Added
>> > > ----------------
>> > >                  CC|                            |eraman at google dot com
>> > >
>> > > --- Comment #13 from Easwaran Raman  
>> > > 2011-04-15 19:18:25 UTC ---
>> > > Richard, did you mean to write
>> > >
>> > > static bool
>> > > can_escape (tree expr)
>> > > {
>> > >   tree base;
>> > >   if (!expr)
>> > >     return true;
>> > >   base = get_base_address (expr);
>> > >   if (DECL_P (base)
>> > >       && (!may_be_aliased (base)
>> > >           && !pt_solution_includes (&cfun->gimple_df->escaped, base)))
>> > >     return false;
>> > >   return true;
>> > > }
>> > >
>> > > Only case when we know it doesn't escape is if base is a DECL_P and is
>> > > not in cfun->gimple_df->escaped and not aliased, right? Actually, I'm
>> > > wondering if it is sufficient to test just
>> > > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base).
>> >
>> > No, because if the escaped solution for example includes ANYTHING then
>> > the test will return true.  That !may-aliased variables are not
>> > contained in ANYTHING isn't known w/o context.
>> >
>> > Richard.
>>
>> Correct me if I am wrong. If I understand you right, just using DECL_P (base)
>> && !pt_solution_includes is conservative since pt_solution_includes may
>> return true if the escaped solution contains ANYTHING. To make it less
>> conservative, you're suggesting
>>
>>   if (DECL_P (base)
>>       && (!may_be_aliased (base)
>>           || !pt_solution_includes (&cfun->gimple_df->escaped, base)))
>>     return false;
>>
>>  I tried that and most Fortran tests are failing. One of the tests
>> (default_format_1.f90) has the following RTL sequence:
>>
>>
>> (insn 30 29 32 4 (set (mem/s/c:SI (plus:DI (reg/f:DI 20 frame)
>>                 (const_int -608 [0xfda0])) [2
>> dt_parm.0.common.flags+0 S4 A64])
>>         (const_int 16512 [0x4080])) default_format_1.inc:56 64
>> {*movsi_internal}
>>      (nil))
>>
>> (insn 32 30 33 4 (set (reg:DI 5 di)
>>         (reg/f:DI 106)) default_format_1.inc:56 62 {*movdi_internal_rex64}
>>      (expr_list:REG_EQUAL (plus:DI (reg/f:DI 20 frame)
>>             (const_int -608 [0xfda0]))
>>         (nil)))
>>
>> (call_insn 33 32 36 4 (call (mem:QI (symbol_ref:DI ("_gfortran_st_write")
>> [flags 0x41]  ) [0
>> _gfortran_st_write S1 A8])
>>         (const_int 0 [0])) default_format_1.inc:56 618 {*call_0}
>>      (expr_list:REG_DEAD (reg:DI 5 di)
>>         (nil))
>>     (expr_list:REG_DEP_TRUE (use (reg:DI 5 di))
>>         (nil)))
>>
>> For the DECL dt_parm, pt_solution_includes (&cfun->gimple_df->escaped, base)
>> returns false, even though its location is passed as a parameter to
>> _gfortran_st_write.
>>
>> I did test  with
>> if (DECL_P (base)
>>       && (!may_be_aliased (base)
>>           && !pt_solution_includes (&cfun->gimple_df->escaped, base)))
>>
>> which has no regressions. Is that what you suggest?
>
> No, the version with || should be ok.  The dt_parm argument does
> not escape at the _gfortran_st_write call site because this
> intrinsic function has a ".wW" fnspec attribute which specifies
> the arguments do not escape.  What you indeed need to do in
> addition to the escaped solution query is walk over all function
> arguments and see if there is one that aliases 'base'.  That
> may not be easily possible on RTL though.  On the tree level
> we have a separate points-to set for such call clobbers/uses
> but we do not preserve it for RTL.

Is it ok to make calls whose arg(s) have EAF_NOESCAPE kill all
locations off the frame in addition to killing all locations that
potentially escape (using the || case you suggested)? Will it be
better or worse than just checking !may_be_aliased (base) alone?

Thanks,
Easwaran




[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-04-15 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

--- Comment #15 from Easwaran Raman 2011-04-15 22:22:15 UTC ---
(In reply to comment #14)
> On Fri, 15 Apr 2011, eraman at google dot com wrote:
> 
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194
> > 
> > Easwaran Raman  changed:
> > 
> >            What    |Removed                     |Added
> > ----------------------------------------------------------------------------
> >                  CC|                            |eraman at google dot com
> > 
> > --- Comment #13 from Easwaran Raman  2011-04-15 
> > 19:18:25 UTC ---
> > Richard, did you mean to write
> > 
> > static bool
> > can_escape (tree expr)
> > {
> >   tree base;
> >   if (!expr)
> >     return true;
> >   base = get_base_address (expr);
> >   if (DECL_P (base)
> >       && (!may_be_aliased (base)
> >           && !pt_solution_includes (&cfun->gimple_df->escaped, base)))
> >     return false;
> >   return true;
> > }
> > 
> > Only case when we know it doesn't escape is if base is a DECL_P and is not in
> > cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if
> > it is sufficient to test just
> > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base).
> 
> No, because if the escaped solution for example includes ANYTHING then
> the test will return true.  That !may-aliased variables are not
> contained in ANYTHING isn't known w/o context.
> 
> Richard.

Correct me if I am wrong. If I understand you right, just using DECL_P (base)
&& !pt_solution_includes is conservative since pt_solution_includes may return
true if the escaped solution contains ANYTHING. To make it less conservative,
you're suggesting

  if (DECL_P (base)
      && (!may_be_aliased (base)
          || !pt_solution_includes (&cfun->gimple_df->escaped, base)))
    return false;

I tried that and most Fortran tests are failing. One of the tests
(default_format_1.f90) has the following RTL sequence:


(insn 30 29 32 4 (set (mem/s/c:SI (plus:DI (reg/f:DI 20 frame)
(const_int -608 [0xfda0])) [2
dt_parm.0.common.flags+0 S4 A64])
(const_int 16512 [0x4080])) default_format_1.inc:56 64
{*movsi_internal}
 (nil))

(insn 32 30 33 4 (set (reg:DI 5 di)
(reg/f:DI 106)) default_format_1.inc:56 62 {*movdi_internal_rex64}
 (expr_list:REG_EQUAL (plus:DI (reg/f:DI 20 frame)
(const_int -608 [0xfda0]))
(nil)))

(call_insn 33 32 36 4 (call (mem:QI (symbol_ref:DI ("_gfortran_st_write")
[flags 0x41]  ) [0
_gfortran_st_write S1 A8])
(const_int 0 [0])) default_format_1.inc:56 618 {*call_0}
 (expr_list:REG_DEAD (reg:DI 5 di)
(nil))
(expr_list:REG_DEP_TRUE (use (reg:DI 5 di))
(nil)))

For the DECL dt_parm, pt_solution_includes (&cfun->gimple_df->escaped, base)
returns false, even though its location is passed as a parameter to
_gfortran_st_write.

I did test with
  if (DECL_P (base)
      && (!may_be_aliased (base)
          && !pt_solution_includes (&cfun->gimple_df->escaped, base)))

which has no regressions. Is that what you suggest?
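
The difference between the two variants is easiest to see with the GCC queries replaced by plain booleans. This is a sketch of the logic only: the three parameters stand in for DECL_P (base), may_be_aliased (base) and pt_solution_includes (&cfun->gimple_df->escaped, base), and the function names are made up. The dt_parm row assumes may_be_aliased is true for that decl, which the discussion implies but does not state.

```c
#include <assert.h>
#include <stdbool.h>

/* Variant with &&: report "cannot escape" only when both tests agree.  */
static bool
can_escape_and (bool is_decl, bool may_alias, bool in_escaped)
{
  if (is_decl && (!may_alias && !in_escaped))
    return false;
  return true;
}

/* Variant with ||: either test alone is enough to report "cannot escape".  */
static bool
can_escape_or (bool is_decl, bool may_alias, bool in_escaped)
{
  if (is_decl && (!may_alias || !in_escaped))
    return false;
  return true;
}

/* The dt_parm situation: a decl that may be aliased but is not in the
   escaped solution.  The two variants disagree exactly here.  */
static void
check_dt_parm_case (void)
{
  assert (can_escape_and (true, true, false));   /* store is kept */
  assert (!can_escape_or (true, true, false));   /* store may be deleted */
}
```

The variants also disagree for a decl that is not aliased but is reported by the escaped solution; for every other input they return the same answer.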


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-04-15 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

Easwaran Raman  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eraman at google dot com

--- Comment #13 from Easwaran Raman 2011-04-15 19:18:25 UTC ---
Richard, did you mean to write

static bool
can_escape (tree expr)
{
  tree base;
  if (!expr)
    return true;
  base = get_base_address (expr);
  if (DECL_P (base)
      && (!may_be_aliased (base)
          && !pt_solution_includes (&cfun->gimple_df->escaped, base)))
    return false;
  return true;
}

The only case where we know it doesn't escape is when base is a DECL_P, is not
in cfun->gimple_df->escaped, and is not aliased, right? Actually, I'm
wondering if it is sufficient to test just
DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base).


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-04-14 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

Easwaran Raman  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #23968|0                           |1
        is obsolete|                            |

--- Comment #10 from Easwaran Raman 2011-04-14 18:59:26 UTC ---
Created attachment 23987
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23987
Fixes a bug in the previous patch


[Bug rtl-optimization/48583] Mismatch between CFG and IR after cfglayout

2011-04-12 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48583

--- Comment #3 from Easwaran Raman 2011-04-13 00:18:38 UTC ---
Sorry for the noise. I have a patch to DSE that fails with nrv5.C and I thought
this is somehow causing it.


[Bug rtl-optimization/48583] New: Mismatch between CFG and IR after cfglayout

2011-04-12 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48583

   Summary: Mismatch between CFG and IR after cfglayout
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: era...@google.com


Created attachment 23970
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23970
RTL dump after into_cfglayout pass.

With gcc built from trunk revision 20110404,
$ trunk_gcc  -O2  gcc/testsuite/g++.dg/opt/nrv5.C  -fdump-rtl-all-all

In nrv5.C.149r.into_cfglayout, function 'void test(bool)' (_Z4testb) has the
following snippet:

;; Start of basic block ( 2) -> 4
;; Pred edge  2 [39.0%]  (fallthru)
(note 9 8 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(many RTL instructions)
(insn 37 36 40 4 (set (mem/s/j/c:QI (plus:DI (reg/f:DI 20 frame)
(const_int -16 [0xfff0])) [0+0 S1 A64])
(reg:QI 82)) nrv5.C:39 66 {*movqi_internal}
 (nil))
;; End of basic block 4 -> ( 6)

;; Succ edge  6 [100.0%]  (fallthru)

;; Start of basic block ( 2) -> 5
;; Pred edge  2 [61.0%] 
(code_label 40 37 41 5 21 "" [1 uses])

(note 41 40 43 5 [bb 5] NOTE_INSN_BASIC_BLOCK)


The last instruction in bb 4 is insn 37, which is not a jump. The successor
edge of bb 4 goes to bb 6 and is labeled a fallthru. But note that bb 5, and
not bb 6, follows insn 37.

In nrv5.C.148r.vregs, insn 37 is followed by a jump_insn which jumps to bb 6. 

The problem seems to be in try_redirect_by_replacing_jump in cfgrtl.c when
called from try_optimize_cfg in cfgcleanup.c. That seems to delete the jump
even though the successor edge is not a fallthru.


[Bug rtl-optimization/44194] struct returned by value generates useless stores

2011-04-12 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194

--- Comment #9 from Easwaran Raman 2011-04-12 22:39:23 UTC ---
Created attachment 23968
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23968
Patch to dse.c to be less conservative with calls.

Currently DSE kills all stores at a call, since a call can do a wild read. But
a call cannot read from the frame unless the location belongs to a local
variable that can escape. This patch ensures that frame-based stores are not
killed at a call if they can't escape. For the first struct return case, this
removes the redundant stores to the local variable. Does this look reasonable?
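
As a rough illustration of the intended behaviour (a self-contained model, not the dse.c implementation), the sketch below tracks which stores remain deletion candidates across a call. `struct store` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a store that DSE is tracking.  */
struct store {
  bool frame_based;   /* address is frame pointer + constant */
  bool can_escape;    /* underlying variable may be visible to a callee */
  bool candidate;     /* still eligible for deletion */
};

/* Old behaviour: a call is a wild read, so no earlier store stays a
   deletion candidate.  */
static void
kill_all_at_call (struct store *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    s[i].candidate = false;
}

/* Proposed behaviour: frame-based stores that cannot escape survive.  */
static void
kill_non_frame_at_call (struct store *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!s[i].frame_based || s[i].can_escape)
      s[i].candidate = false;
}

/* Count the stores that are still deletion candidates.  */
static size_t
count_candidates (const struct store *s, size_t n)
{
  assert (s != NULL || n == 0);
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (s[i].candidate)
      count++;
  return count;
}
```

With one non-escaping frame store, one escaping frame store and one non-frame store, the proposed rule leaves exactly one candidate where the old rule leaves none.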


[Bug target/44575] [4.5 Regression] __builtin_va_arg overwrites into adjacent stack location

2010-09-29 Thread eraman at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44575

--- Comment #7 from Easwaran Raman 2010-09-30 00:21:17 UTC ---
This is a variation of the same problem where __builtin_va_arg overwrites into
adjacent stack location [Not sure if I should reopen this bug or file a new
one]:

$ cat vararg.cc

#include <stdarg.h>
#include <stdlib.h>
struct S933 { struct{struct{}b[6];union{}c[7];}a;char d;char e; };

struct S933 arg;
void check933va (int z, ...) {
  char c;
  va_list ap;
  __builtin_va_start(ap,z);
  c = 'a';
  arg = __builtin_va_arg(ap,struct S933);
  if (c != 'a')
    abort();

}
int main() {
  struct S933 s933;
  check933va (1, s933);
}

$ ./trunk-g++  -O0  vararg.cc && ./a.out
Aborted

./trunk-g++ is GNU C++  version 4.6.0 20100924 (experimental)
(x86_64-unknown-linux-gnu)

The relevant portion of the gimple is below:
  D.2773_4 = ap.reg_save_area;
  D.2774_5 = ap.gp_offset;
  D.2775_6 = (long unsigned int) D.2774_5;
  int_addr.1_7 = D.2773_4 + D.2775_6;
  addr.0_8 = &va_arg_tmp.3;
  D.2777_9 = addr.0_8 + 8;
  D.2778_10 = MEM[(long unsigned int *)int_addr.1_7];
  *D.2777_9 = D.2778_10;   <--- Bad move

The move to address D.2777_9 is the problem.

For this struct type, construct_container returns the following:

(parallel:BLK [
(expr_list:REG_DEP_TRUE (reg:DI 0 ax)
(const_int 8 [0x8]))
])

The destination of the move is at offset 8 (INTVAL (XEXP (slot, 1))) of the
temporary created. The size of the temp (sizeof(S933)) is 15 bytes and the move
is in DI mode, so it writes bytes 8 through 15 and runs one byte past the end.
I think the problem is that the check if (prev_size + cur_size > size) doesn't
really test whether the move writes past the end of the destination.
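
For reference, the condition that has to hold can be written as a small standalone predicate. `move_overruns` is a hypothetical helper for illustration, not the check in the x86 backend; it is phrased to avoid unsigned wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if a move of MOVE_SIZE bytes to byte offset OFFSET of an object
   of OBJ_SIZE bytes would write past the end of the object.  */
static bool
move_overruns (size_t offset, size_t move_size, size_t obj_size)
{
  assert (move_size > 0);
  return offset > obj_size || move_size > obj_size - offset;
}
```

For the case above, move_overruns (8, 8, 15) is true: an 8-byte DImode move at offset 8 needs 16 bytes, but the temporary has only 15.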


[Bug target/44575] New: __builtin_va_arg overwrites into adjacent stack location

2010-06-17 Thread eraman at google dot com
$ cat vararg.c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

int fails = 0;
struct S116 { float a[3]; } ;
struct S116 a116[5];

void check116va (int z, ...)
{ struct S116 arg, *p;
  va_list ap;
  int j=0,k=0;
  int i;
  __builtin_va_start(ap,z);
  for (i = 2; i < 4; ++i) {
    p = NULL;
    j++;
    k+=2;
    switch ((z << 4) | i) {
      case 0x12: case 0x13: p = &a116[2]; arg = __builtin_va_arg(ap,struct S116); break;
      default: ++fails; break;
    }
    if (p && p->a[2] != arg.a[2]) {
      ++fails;
    }
    if (fails)
      break;
  }
  __builtin_va_end(ap);
}
int main()
{
  memset (a116, '\0', sizeof (a116));
  a116[2].a[2] = -49026.625000;
  check116va (1, a116[2], a116[2]);
  if (fails)
    abort();
}

$ ./trunk-gcc -O0  vararg.c && ./a.out
Aborted

./trunk-gcc is gcc 4.6.0  configured with --target=x86_64-unknown-linux-gnu
--disable-nls --enable-threads=posix --enable-symvers=gnu --enable-__cxa_atexit
--enable-c99 --enable-long-long --with-gnu-as --with-gnu-ld
--build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu
--enable-checking=release --enable-multilib --enable-targets=all
--with-arch-32=pentium3 --with-tune-32=pentium4 
--enable-shared=libgcc,libmudflap,libssp,libstdc++,libgfortran
--with-pic=libgfortran --enable-languages=c,c++,fortran 
--with-native-system-header-dir=/include  --enable-linker-build-id 
--with-host-libstdcxx=-lstdc++ FCFLAGS='-g -O2 ' 

The test case passes with gcc 4.2.4 and 4.4.3.

The gimple for __builtin_va_arg (from vararg.c.004t.gimple ) contains

  addr.1 = &va_arg_tmp.4;
  addr.5 = (long unsigned int * {ref-all}) addr.1;
  sse_addr.6 = (long unsigned int *) sse_addr.3;
  D.3520 = *sse_addr.6;
  *addr.5 = D.3520;  ---> (1)  
  addr.7 = (long unsigned int * {ref-all}) addr.1;
  D.3522 = addr.7 + 8;
  sse_addr.8 = (long unsigned int *) sse_addr.3;
  D.3524 = sse_addr.8 + 16;
  D.3525 = *D.3524;
  *D.3522 = D.3525; ---> (2)

The assignments (1) and (2) above are 8-byte moves into va_arg_tmp.4, one at
offset 0 and another at offset 8. But the size of va_arg_tmp.4 is 12 bytes
(sizeof (struct S116)), so the second move overwrites the adjacent stack
location (variable i in this case), which leads to the failure.
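
One way to see what a safe copy would look like is to split the 12 bytes into moves that never cross the end of the object. The helper below is a hypothetical sketch of that arithmetic and says nothing about how the bug was eventually fixed in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Split a copy of TOTAL bytes into moves of at most WORD bytes each.
   The size of each move is stored in SIZES, which has room for CAP
   entries.  Returns the number of moves written; if CAP is too small
   the plan is truncated.  */
static size_t
plan_moves (size_t total, size_t word, size_t *sizes, size_t cap)
{
  assert (word > 0);
  size_t n = 0;
  while (total > 0 && n < cap)
    {
      size_t chunk = total < word ? total : word;
      sizes[n++] = chunk;
      total -= chunk;
    }
  return n;
}
```

For struct S116 this gives an 8-byte move followed by a 4-byte move, so the second move ends at byte 12 instead of byte 16.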


-- 
   Summary: __builtin_va_arg overwrites into adjacent stack location
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: eraman at google dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44575