[Bug middle-end/45422] [4.6 Regression] compile time increases 3x.

2010-08-31 Thread davidxl at gcc dot gnu dot org


--- Comment #26 from davidxl at gcc dot gnu dot org  2010-08-31 17:45 
---
Good observation re. the number of IVs in the final set. This usually points to
some problem/bug in the cost function. I briefly looked at this case -- it
indeed exposes two more bugs in the cost model:

1) the computation costs of all the cost pairs in an assignment cannot simply
be added together, because many rewrite expressions can be commoned.
We now have a mechanism to account for common loop invariants in register
pressure estimation, and this mechanism needs to be extended to computation
cost.

2) the offset is not stripped when computing loop invariant expression ids --
this can cause register pressure to be overestimated. (The case arises more
often with loop unrolling).
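
To make point 1 concrete, here is a minimal sketch, not GCC's actual ivopts code. It assumes each cost pair carries a hypothetical `inv_expr_id` naming the loop-invariant expression its rewrite needs (0 meaning none), and that the whole pair cost is attributable to that expression. Under those assumptions a naive sum charges a shared expression once per use, while the commoned sum charges it once in total.

```c
#include <stddef.h>

/* Hypothetical cost pair: cost of rewriting one use with one candidate,
   tagged with the id of the invariant expression it needs (0 = none).
   This is an illustration, not GCC's data structure.  */
struct cost_pair_sketch {
    unsigned inv_expr_id;
    unsigned cost;
};

#define MAX_INV_EXPR_IDS 64

/* Naive total: every pair pays full cost, even for a shared expression.  */
unsigned naive_total_cost(const struct cost_pair_sketch *pairs, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += pairs[i].cost;
    return total;
}

/* Commoned total: a shared invariant expression is paid for only once.  */
unsigned commoned_total_cost(const struct cost_pair_sketch *pairs, size_t n)
{
    unsigned char seen[MAX_INV_EXPR_IDS] = {0};
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned id = pairs[i].inv_expr_id;
        if (id != 0 && id < MAX_INV_EXPR_IDS) {
            if (seen[id])
                continue;       /* already computed for an earlier use */
            seen[id] = 1;
        }
        total += pairs[i].cost;
    }
    return total;
}
```

With two uses sharing expression 1, the naive sum is larger than the commoned one, which is the kind of overestimate that could push the selection toward a larger IV set.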

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422



[Bug middle-end/45422] [4.6 Regression] compile time increases 3x.

2010-08-30 Thread davidxl at gcc dot gnu dot org


--- Comment #25 from davidxl at gcc dot gnu dot org  2010-08-30 16:41 
---
(In reply to comment #24)
 (In reply to comment #20)
  (In reply to comment #16)
   adjust summary according to the last timings
   
  
  I am surprised to see such big differences between trunk and previous 
  releases.
  Compiling this test case with the those options on my core2 box (2.4GHz ) 
  took
  only 56seconds which is comparable with the timing with a 4.4.3 compiler 
  (with
  google local patches including ivopt improvements).
 
 Of course - because the ivopt improvement patches are the problem.
 

It is just the total time diff from Joost's measure can be just explained by
ivopt component.

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422



[Bug middle-end/45422] [4.6 Regression] compile time increases 3x.

2010-08-29 Thread davidxl at gcc dot gnu dot org


--- Comment #20 from davidxl at gcc dot gnu dot org  2010-08-30 03:10 
---
(In reply to comment #16)
 adjust summary according to the last timings
 

I am surprised to see such big differences between trunk and previous releases.
Compiling this test case with those options on my core2 box (2.4GHz) took
only 56 seconds, which is comparable to the timing with a 4.4.3 compiler (with
google local patches including ivopt improvements).

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422



[Bug middle-end/45422] [4.6 Regression] compile time increases 3x.

2010-08-29 Thread davidxl at gcc dot gnu dot org


--- Comment #21 from davidxl at gcc dot gnu dot org  2010-08-30 03:19 
---
(In reply to comment #17)
  tree iv optimization  :  32.57 (20%) usr   0.10 ( 5%) sys  32.73 (20%) wall 
 322095 kB (18%) ggc
 
 
 20% is still completely unreasonable for IV optimization.
 

There was a patch in trunk that may double the time in ivopt -- i.e.
find_optimal_iv_set_1 is run twice, once with the original iv set and once
with the full set. This probably needs to be revisited. 

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422



[Bug middle-end/45422] [4.6 Regression] compile time increases 8x.

2010-08-28 Thread davidxl at gcc dot gnu dot org


--- Comment #10 from davidxl at gcc dot gnu dot org  2010-08-28 06:00 
---
fixed in r163610.


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422



[Bug middle-end/45422] [4.6 Regression] compile time increases 8x.

2010-08-27 Thread davidxl at gcc dot gnu dot org


--- Comment #9 from davidxl at gcc dot gnu dot org  2010-08-27 17:01 ---
Will take a look


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |davidxl at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2010-08-27 10:23:21         |2010-08-27 17:01:01
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422



[Bug middle-end/45098] Missed induction variable optimization

2010-07-30 Thread davidxl at gcc dot gnu dot org


--- Comment #1 from davidxl at gcc dot gnu dot org  2010-07-30 17:23 ---
Seems -Os specific -- also reproducible on x86. With -O2, the result is
expected.

David


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |davidxl at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45098



[Bug c++/45121] [4.6 Regression] c-c++-common/uninit-17.c

2010-07-29 Thread davidxl at gcc dot gnu dot org


--- Comment #3 from davidxl at gcc dot gnu dot org  2010-07-29 17:21 ---
Fixed in r162687


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45121



[Bug c++/45121] [4.6 Regression] c-c++-common/uninit-17.c

2010-07-28 Thread davidxl at gcc dot gnu dot org


--- Comment #2 from davidxl at gcc dot gnu dot org  2010-07-29 05:51 ---
The problem is that before the ivopt patch, ivopt introduced an iv
candidate that was unconditionally initialized with b:

  ivtmp_xxx = b (D);

After the patch, this assignment no longer exists, and the use of b in the test
is via a PHI def -- thus the warning becomes 'may be uninitialized'.

Will fix the test case.

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45121



[Bug testsuite/44932] gcc.dg/uninit-pred-9_b.c fails

2010-07-19 Thread davidxl at gcc dot gnu dot org


--- Comment #4 from davidxl at gcc dot gnu dot org  2010-07-19 16:34 ---
Fixed in r162310.

David


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44932



[Bug testsuite/44932] gcc.dg/uninit-pred-9_b.c fails

2010-07-13 Thread davidxl at gcc dot gnu dot org


--- Comment #1 from davidxl at gcc dot gnu dot org  2010-07-14 04:12 ---

This seems to be specific to powerpc.

Could you attach the dump files with options:

-O2 -Wuninitialized -fdump-tree-cddce2 -fdump-tree-uninit-details

Thanks,

David


(In reply to comment #0)
 Subject testcase fails on powerpc64.
 
 FAIL: gcc.dg/uninit-pred-9_b.c bogus warning (test for bogus messages, line 
 24)
 
 
 Compiling standalone I see the following:
 
 pthaugen/work ~/install/gcc/trunk/bin/gcc -O2 -S -m32 -Wuninitialized
 ~/src/gcc/trunk/gcc/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
 /home/pthaugen/src/gcc/trunk/gcc/gcc/testsuite/gcc.dg/uninit-pred-9_b.c: In
 function 'foo':
 /home/pthaugen/src/gcc/trunk/gcc/gcc/testsuite/gcc.dg/uninit-pred-9_b.c:24:11:
 warning: 'v' may be used uninitialized in this function [-Wuninitialized]
 /home/pthaugen/src/gcc/trunk/gcc/gcc/testsuite/gcc.dg/uninit-pred-9_b.c: In
 function 'foo_2':
 /home/pthaugen/src/gcc/trunk/gcc/gcc/testsuite/gcc.dg/uninit-pred-9_b.c:41:11:
 warning: 'v' may be used uninitialized in this function [-Wuninitialized]
 


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xinliangli at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44932



[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues

2010-04-22 Thread davidxl at gcc dot gnu dot org


--- Comment #3 from davidxl at gcc dot gnu dot org  2010-04-22 17:04 ---
(In reply to comment #2)
 (In reply to comment #1)
  
  so it doesn't consider the struct with the array for total scalarization
  for some reason.  Martin?
  
 
 Well, that was a deliberate decision when fixing PR 42585 (see
 type_consists_of_records_p).  The code is simpler because it does not
 have to know how to iterate over the array index domain.
 
 Of course, we can alleviate this restriction and learn how to iterate.
 However, all the accesses for the whole array are already created,
 that is not the issue.  The problem basically is that when we see the
 sequence
 
   D.2035.m[0] = D.2044_20;
   D.2035.m[1] = D.2043_19;
   D.2035.m[2] = D.2042_18;
   *b_1(D) = D.2035;
 
 (and there are no other accesses to D.2035) the condition that tries
 to prevent us from creating unnecessary replacements kicks in and we
 decide not to scalarize. 

This code sequence looks like a good motivating factor for
scalarizing/expansion. In fact, small arrays should be treated the same way as
records if all accesses are through compile time constant indices. This is a
common scenario after full unrolling. 

 The intent of the current code (possibly
 among other reasons) was to avoid going through a replacement when the
 whole structure was then passed as an argument to a function and
 similar situations. 

If the temp aggregate is passed to a call and the calling convention is not
exposed at the IL level, then it is not a good SRA candidate, as no copy
elimination (of either code or storage) will be exposed. In this case, the temp
aggregate is used as the RHS of an assignment, thus it is a good candidate to
expand. The same holds for the reverse case:

aggregate1 = aggregate2;
 ..
... = aggregate1.e1;
... = aggregate1.e2;
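
A source-level sketch of the first situation may help; it is an illustration only, and the names `vec3`, `store_via_temp` and `store_scalarized` are hypothetical. The first function mirrors the quoted GIMPLE sequence (a temporary filled through constant indices, then copied as a whole). The second shows what scalarizing the temporary would amount to.

```c
/* Hypothetical aggregate with a small array member, mirroring D.2035.m.  */
struct vec3 { int m[3]; };

/* Shape of the quoted sequence: fill a temporary through compile-time
   constant indices, then copy the whole aggregate out.  */
void store_via_temp(struct vec3 *b, int x, int y, int z)
{
    struct vec3 tmp;
    tmp.m[0] = x;
    tmp.m[1] = y;
    tmp.m[2] = z;
    *b = tmp;
}

/* What expanding the temporary would give: no tmp, three direct stores.  */
void store_scalarized(struct vec3 *b, int x, int y, int z)
{
    b->m[0] = x;
    b->m[1] = y;
    b->m[2] = z;
}
```

Both functions leave `*b` in the same state; the second simply has no aggregate copy left to eliminate.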

David

 But it should not be very difficult to change the
 condition (in analyze_access_subtree) to handle both situations right.
 
 Doing this, rather than total scalarization for arrays (which should
 be only useful as a substitute for a copy propagation) should enable
 us to handle even huge arrays.
 
 I'll get to this right after dealing with PR 43835.
 


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xinliangli at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846



[Bug middle-end/36550] Wrong may be used uninitialized warning (conditional PHIs)

2010-04-20 Thread davidxl at gcc dot gnu dot org


--- Comment #11 from davidxl at gcc dot gnu dot org  2010-04-20 23:55 
---
(In reply to comment #2)
 (In reply to comment #1)
  check() can return 1 on the first call and 0 on the second and if *argv is
  NULL then bug will be used uninitialized.
 
 right, but this doesn't matter here. Better testcase:
 
 /* { dg-do compile } */
 /* { dg-options "-Os -Wuninitialized" } */
 void bail(void) __attribute__((noreturn));
 unsigned once(void);
 int pr(char**argv)
 {
     char *bug;
     unsigned check = once();
     if (check) {
         if (*argv)
             bug = *++argv;
     } else {
         bug = *argv++;
         if (!*argv)
             bail();
     }
     /* now bug is set except if (check && !*argv) */
     if (check) {
         if (!*argv)
             return 0;
     }
     /* if we ever get here then bug is set */
     return *bug != 'X';
 }
 


The example is a little tricky for the compiler to reason about because of the
'++argv'. Predicate analysis
(http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00706.html -- with an additional
fix to the handling of never-returning calls) will catch the following case
(while trunk gcc does not):


void bail(void) __attribute__((noreturn));
int foo(void);
unsigned once(void);
int pr(char**argv)
{
    char *bug;
    unsigned check = once();
    char * a = *argv;
    if (check) {
        if (a)
            bug = *++argv;
    } else {
        bug = *argv++;
        if (!*argv)
            bail();
    }

    if (foo ())
        once();

    /* now bug is set except if (check && !*argv) */
    if (check) {
        if (!a || !*argv)
            return 0;
    }
    /* if we ever get here then bug is set */
    return *bug != 'X';
}


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |davidxl at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36550



[Bug middle-end/20968] spurious may be used uninitialized warning (conditional PHIs)

2010-04-20 Thread davidxl at gcc dot gnu dot org


--- Comment #8 from davidxl at gcc dot gnu dot org  2010-04-21 00:27 ---
(In reply to comment #2)
 Note this is not fully a regression but really a progression.
 What is happening now is only partial optimizations is happen before the 
 warning to happen.
 
 I was unable to reduce the test case further without making the warning
 disappear.  In particular, removing the increment of v1->count makes the
 warning disappear.
 This is because we would then jump thread the jump.
 
 Again this is because we are emitting the warning too soon, I might be able 
 to come up with a testcase 
 which shows that this is not really a regression but a progression in that we 
 have warned in 3.4 and 
 4.0:
 struct {int count;} *v1;
 int c;
 int k;
 
 extern void baz(int);
 void foo(void)
 {
     int i;
     int r;
     if (k == 4)
     {
         i = 1;
         r = 1;
     }
     else
         r = 0;
 
     if (!r)
     {
         if (!c)
             return;
         v1->count++;
     }
     if (!c)
     {
         baz(i);
     }
 }
 
 There is no different from the case above and the functions you gave below.
 
 There has been some talking about moving where we warn about uninitialized 
 variables but I feel that 
 you can get around this in your code.

To reproduce the problem -- -fno-tree-vrp  -fno-tree-dominator-opts
-fno-tree-ccp are needed. This 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20968



[Bug c/42643] may be used uninitialized compiled with -Wall -O

2010-04-20 Thread davidxl at gcc dot gnu dot org


--- Comment #1 from davidxl at gcc dot gnu dot org  2010-04-21 00:29 ---
(In reply to comment #0)
 When compiling the source with -Wall -O, gcc gives the following warning:
 
 % gcc -c -Wall -O gcc_test.c
 gcc_test.c: In function 'functionLeon':
 gcc_test.c:11: warning: 'reference' may be used uninitialized in this function
 
 % cat gcc_test.c
 #include <stdio.h>
 
 typedef struct {
   int yb;
 } TCRData;
 
 void functionLeon (TCRData *pParent, int pBool);
 void functionLeon (TCRData *pParent, int pBool)
 {
     int isRootCell;
     TCRData *reference;
 
     isRootCell = (pParent == NULL);
 
     if (!isRootCell)
         reference = pParent;
 
     if (pBool) {
         if (!isRootCell)
             reference->yb++;
     }
 }
 
 % gcc -v
 Using built-in specs.
 Target: x86_64-redhat-linux
 Configured with: ../../src/gcc-4.4.0/configure
 --prefix=/remote/depotsrc/depotsrc/amd64-2.4/local_install/gcc-4.4.0
 --enable-bootstrap --enable-shared --enable-threads=posix --disable-checking
 -with-gmp=/remote/depotsrc/depotsrc/amd64-2.4/local_install/gmp-4.3.1
 --with-mpfr=/remote/depotsrc/depotsrc/amd64-2.4/local_install/mpfr-2.4.1
 --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
 --enable-languages=c,c++,fortran --with-cpu=generic 
 --build=x86_64-redhat-linux
 Thread model: posix
 gcc version 4.4.0 (GCC)
 

This is a common case handled by patch in

http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00706.html


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |davidxl at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42643



[Bug middle-end/35560] Missing CSE/PRE for memory operations involved in virtual call.

2010-02-03 Thread davidxl at gcc dot gnu dot org


--- Comment #6 from davidxl at gcc dot gnu dot org  2010-02-03 18:30 ---
See discussions in http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00138.html
about changing dynamic types using placement new -- it is basically not allowed
-- so the optimization is valid.


David


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |davidxl at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2008-11-29 22:42:58         |2010-02-03 18:30:00
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35560



[Bug middle-end/35560] Missing CSE/PRE for memory operations involved in virtual call.

2010-02-03 Thread davidxl at gcc dot gnu dot org


--- Comment #8 from davidxl at gcc dot gnu dot org  2010-02-03 21:44 ---
(In reply to comment #7)
 It is valid to use placement new to construct a more or less derived type
 which would change the vtable pointer.
 
 Thus I think this bug is still invalid.
 

How did you reach this conclusion from reading p7 of 3.8 in the standard?
The original object was a most derived object of type T and the new object is
a most derived object of type T

The following is allowed:
class B {
 virtual ...
};

class D : public B {
  ...
};

B* bp = new D ();
...

new (bp) D();

but vptr does not change.


Set aside the standard -- this optimization is useful regardless. Some
developers are so desperate that they manually do LICM of vptr and vtbl accesses
for vcalls in the loop.  The worst case is to use an option to guard it (which I
think should be on by default).
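
As an illustration of the manual hoisting mentioned above, here is a sketch in plain C with an explicit vtable struct standing in for what a C++ compiler generates. All names (`shape`, `shape_vtbl`, `make_rect`, and so on) are hypothetical, and the sketch assumes the callee never changes the object's dynamic type.

```c
#include <stddef.h>

struct shape;
struct shape_vtbl { int (*area)(const struct shape *); };
struct shape { const struct shape_vtbl *vptr; int w, h; };

static int rect_area(const struct shape *s) { return s->w * s->h; }
static const struct shape_vtbl rect_vtbl = { rect_area };

struct shape make_rect(int w, int h)
{
    struct shape s = { &rect_vtbl, w, h };
    return s;
}

/* As written, each iteration goes through s->vptr and the vtbl slot.  */
int sum_area_naive(const struct shape *s, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += s->vptr->area(s);
    return total;
}

/* Manual LICM: the vptr and vtbl slot are read once, before the loop.  */
int sum_area_hoisted(const struct shape *s, size_t n)
{
    int (*area)(const struct shape *) = s->vptr->area;
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += area(s);
    return total;
}
```

The two loops compute the same result only because `rect_area` leaves `vptr` alone, which is the property the optimization has to establish.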

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35560



[Bug middle-end/35560] Missing CSE/PRE for memory operations involved in virtual call.

2010-02-03 Thread davidxl at gcc dot gnu dot org


--- Comment #11 from davidxl at gcc dot gnu dot org  2010-02-03 21:55 
---
(In reply to comment #9)
 Ah, Set aside the standard. Another user who wants to make up his own
 semantics for a standardized language. No, no, and damn no.
 


Of course, things like this can be brought up to the language committee as long
as it is 1) not ambiguous 2) and generally useful.

(In terms of optimization related semantics (type aliasing, restrict etc), I am
not sure how standard it actually is given the ambiguity here and there.)

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35560



[Bug middle-end/35560] Missing CSE/PRE for memory operations involved in virtual call.

2010-02-03 Thread davidxl at gcc dot gnu dot org


--- Comment #13 from davidxl at gcc dot gnu dot org  2010-02-03 22:05 
---
(In reply to comment #12)
 Btw, a destructor call also changes the vtbl pointer.
 

ctors, dtors, wrapper function calls etc. are all handled. A detailed write-up
will be available at some point. To put it simply, it is done via live-across
analysis: if a polymorphic object is referenced before and after a call
(accesses to any field of it are both available and anticipated from the call),
it is live across the call, so the vptr field won't be modified by the call.
The partially anticipated case is also handled.  Once vptr is handled, vtbl
access follows automatically, as vtbls are RO. vptr assignment is treated
conservatively.

I implemented this in the 4.4 line using special shadow symbols and
VUSE/VDEFs. It works as expected except that SCCVN time went to hell. A simple
fix to collapse varying defs in the DFS walk helps a lot, but it is still slow.
This needs to be done using the alias oracle.

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35560



[Bug target/40956] GCSE opportunity in if statement

2009-12-23 Thread davidxl at gcc dot gnu dot org


--- Comment #3 from davidxl at gcc dot gnu dot org  2009-12-23 19:37 ---
This bug is specific to ARM (Thumb mode). On x86, the hoisting is unnecessary as
the move instruction supports the immediate form. 

The issue here is more in the GIMPLE canonicalization (target specific). In
this case, the IR should be in the following form to expose the hoisting.

if (...) {
   temp = 0;
   *p = temp;
}
else
{
   temp = 0;
   *(p+1) = temp;
}
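
For reference, a sketch of what the code could look like once the common `temp = 0` has been hoisted out of both arms. The function name and parameters are hypothetical; on a target without a store-immediate form, the constant would then be materialized in a register only once.

```c
/* Hoisted form: the constant is set up once, before the branch.  */
void store_zero(int cond, int *p)
{
    int temp = 0;
    if (cond)
        *p = temp;
    else
        *(p + 1) = temp;
}
```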


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40956



[Bug tree-optimization/42337] GCC ICE in compute_antic, at tree-ssa-pre.c:2534

2009-12-09 Thread davidxl at gcc dot gnu dot org


--- Comment #2 from davidxl at gcc dot gnu dot org  2009-12-09 18:07 ---
Fixed in r155111.


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42337



[Bug tree-optimization/39557] Invalid PDOM lead to infinite loop to be generated

2009-03-27 Thread davidxl at gcc dot gnu dot org


--- Comment #2 from davidxl at gcc dot gnu dot org  2009-03-27 18:25 ---
See SVN revision 145121 for the fix.


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39557



[Bug tree-optimization/39548] gcc ICE compiling code with option -fprofile-generate

2009-03-27 Thread davidxl at gcc dot gnu dot org


--- Comment #8 from davidxl at gcc dot gnu dot org  2009-03-27 18:28 ---
See r145118 for the fix.


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39548



[Bug tree-optimization/39557] New: Invalid PDOM lead to infinite loop to be generated

2009-03-25 Thread davidxl at gcc dot gnu dot org
Compiling the attached source with the following options

 -Wall -fno-exceptions -O2 -fprofile-use=/blah  -fno-rtti 

results in code with an infinite loop.

In DCE, special code is added to handle dead loops conservatively. However, this
requires PDOM information (control dependence info) to be valid. The PDOM is
created in the uninitialized variable warning pass, but gets invalidated before
the cddce pass (the incremental update does not work well). With the wrong CD
info, the DCE pass tries to eliminate the loop, but the exit edge fixup code ends
up linking the predecessor not to its post-dom bb, but to itself -- leading to
an infinite loop.

A proposed patch will be posted to gcc-patches.

David


-- 
   Summary: Invalid PDOM lead to infinite loop to be generated
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: davidxl at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39557



[Bug tree-optimization/39557] Invalid PDOM lead to infinite loop to be generated

2009-03-25 Thread davidxl at gcc dot gnu dot org


--- Comment #1 from davidxl at gcc dot gnu dot org  2009-03-25 23:10 ---
Created an attachment (id=17542)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17542&action=view)
test case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39557



[Bug tree-optimization/39548] gcc ICE compiling code with option -fprofile-generate

2009-03-24 Thread davidxl at gcc dot gnu dot org


--- Comment #1 from davidxl at gcc dot gnu dot org  2009-03-24 17:50 ---
Created an attachment (id=17538)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17538&action=view)
Test case


-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |davidxl at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39548



[Bug tree-optimization/39548] gcc ICE compiling code with option -fprofile-generate

2009-03-24 Thread davidxl at gcc dot gnu dot org


--- Comment #2 from davidxl at gcc dot gnu dot org  2009-03-24 17:51 ---
Created an attachment (id=17539)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17539&action=view)
patch file


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39548



[Bug tree-optimization/39548] gcc ICE compiling code with option -fprofile-generate

2009-03-24 Thread davidxl at gcc dot gnu dot org


--- Comment #5 from davidxl at gcc dot gnu dot org  2009-03-24 21:25 ---
(In reply to comment #3)
 It might be better to place the check after the loop (and put an assert in
 set_copy_of_val that triggers the copy may not happen).
 

This sounds good.

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39548



[Bug tree-optimization/39548] gcc ICE compiling code with option -fprofile-generate

2009-03-24 Thread davidxl at gcc dot gnu dot org


--- Comment #6 from davidxl at gcc dot gnu dot org  2009-03-24 21:33 ---
(In reply to comment #4)
 Btw, it shouldn't really happen that we are not allowed to copyprop PHI
 arguments.  It hints at some inconsistency in the IL instead.
 

Yes I suspect that too, but this is an independent issue. As long as the check
is done in replace_uses_in (tree-ssa-propagate), it should be done in the copy
chain computation -- at least it should be done in line 742 of tree-ssa-copy.c
(copy_prop_visit_cond_stmt), which was my original fix.

By the way, the check that fails in may_propagate_copy is the following, which
looks hairy. If you think it is ok, I can file a different bug to track this.



  else if (!MTAG_P (SSA_NAME_VAR (dest))
           && !MTAG_P (SSA_NAME_VAR (orig))
           && (DECL_NO_TBAA_P (SSA_NAME_VAR (dest))
               != DECL_NO_TBAA_P (SSA_NAME_VAR (orig))))


Thanks,

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39548



[Bug middle-end/38204] PRE for post dominating expressions

2008-11-21 Thread davidxl at gcc dot gnu dot org


--- Comment #3 from davidxl at gcc dot gnu dot org  2008-11-22 00:35 ---
(In reply to comment #2)
 (In reply to comment #0)
  For this function:
  int test (int a, int b, int c, int g)
  {
    int d, e;
    if (a)
      d = b * c;
    else
      d = b - c;
    e = b * c + g;
    return d + e;
  }
  
  the multiply expression is moved to both branches of the if, it would be
  better to move it before the if.  Intel's compiler does that.
  
 
 Moving it before the if is a code size optimization that also happens to 
 extend
 the lifetime of the multiply.
 So better is a relative term.
 

As a side note: 

PRE is made aware of the impact of code size bloat and is -Os friendly.  For
instance, if multiple insertions are needed, PRE won't happen with -Os.

if (..)
   expr
else if (..)
   ...
else if (..)
   ...
else
   ...

expr

While this is good, if the hoisting opportunities exposed by PRE are
materialized, this PRE should still be allowed under -Os.

(-Os in gcc is not well tuned -- many optimizations are simply turned off for
fear of code bloat without analysis -- the end result is often lost
opportunities for code cleanup, ending up with a slower and BIGGER binary.)

The hoisting increases the temporary's lifetime slightly, but it also adds more
scheduling freedom as a good effect.
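
To illustrate the hoisting under discussion, here is the quoted test function next to a hand-hoisted version. The names `pre_original` and `pre_hoisted` are hypothetical, and the second function is a source-level sketch of the intended result, not actual compiler output.

```c
/* The function from the report: b * c appears in one arm and after the join. */
int pre_original(int a, int b, int c, int g)
{
    int d, e;
    if (a)
        d = b * c;
    else
        d = b - c;
    e = b * c + g;
    return d + e;
}

/* Hand-hoisted form: one multiply, computed before the branch.  */
int pre_hoisted(int a, int b, int c, int g)
{
    int t = b * c;
    int d = a ? t : b - c;
    int e = t + g;
    return d + e;
}
```

The hoisted form contains a single multiply instead of the two that result from inserting it on both branches, at the price of `t` being live across the branch.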

David 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38204



[Bug rtl-optimization/36438] New: gcc ICE compiling code with mmx builtin

2008-06-05 Thread davidxl at gcc dot gnu dot org
Compiling the following code with latest compiler, got ice:

f.i: In function 'void foo(int __vector__*, int)':
f.i:33: internal compiler error: in trunc_int_for_mode, at explow.c:55
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.


// f.i:
typedef unsigned short int16;
typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
typedef long long __v1di __attribute__ ((__vector_size__ (8)));

extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
_mm_slli_si64 (__m64 __m, int __count)
{
  return (__m64) __builtin_ia32_psllqi ((__v1di)__m, __count);
}

extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
_mm_set_pi16 (short __w3, short __w2, short __w1, short __w0)
{
  return (__m64) __builtin_ia32_vec_init_v4hi (__w0, __w1, __w2, __w3);
}

inline __m64 __attribute__((__always_inline__)) SetS16(int16 a, int16 b, int16
c, int16 d) {
  return _mm_set_pi16(d, c, b, a);
}

void foo(__m64* dest, int n) {

  __m64 mask = SetS16(0x00FF, 0xFF00, 0x, 0x00FF);
  for ( int i = 0 ; i < n; ++i ) {

    mask = _mm_slli_si64(mask, 8);
    mask = _mm_slli_si64(mask, 8);

    *dest = mask;
    ++dest;
  }
  __builtin_ia32_emms ();
}


-- 
   Summary: gcc ICE compiling code with mmx builtin
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: davidxl at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36438



[Bug rtl-optimization/36438] gcc ICE compiling code with mmx builtin

2008-06-05 Thread davidxl at gcc dot gnu dot org


--- Comment #1 from davidxl at gcc dot gnu dot org  2008-06-05 06:41 ---

cse1 (RTL) does some expression simplification on the fly such as 

t = x << 4
r = t << 4

==>
r = x << 8

However for mmx shift operation, the mode (V1DI) for the const folding is
illegal -- resulting in ICE. 
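
For a plain 64-bit scalar, the folding itself is sound whenever the combined shift count stays below the operand width, as this small sketch shows (function names are hypothetical). The ICE comes from attempting the same constant folding in the vector mode, not from the arithmetic.

```c
#include <stdint.h>

/* Two successive left shifts by 4 ...  */
uint64_t shift_twice(uint64_t x)
{
    uint64_t t = x << 4;
    return t << 4;
}

/* ... are equivalent to one left shift by 8 on an unsigned 64-bit value.  */
uint64_t shift_once(uint64_t x)
{
    return x << 8;
}
```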


-- 

davidxl at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||davidxl at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36438



[Bug rtl-optimization/36438] gcc ICE compiling code with mmx builtin

2008-06-05 Thread davidxl at gcc dot gnu dot org


--- Comment #6 from davidxl at gcc dot gnu dot org  2008-06-05 17:37 ---
(In reply to comment #5)
 Patch at http://gcc.gnu.org/ml/gcc-patches/2008-06/msg00268.html
 

Thanks -- same as my local workaround.

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36438



[Bug c++/23383] builtin array operator new is not marked with malloc attribute

2008-06-04 Thread davidxl at gcc dot gnu dot org


--- Comment #13 from davidxl at gcc dot gnu dot org  2008-06-04 16:48 
---
(In reply to comment #12)
 Interesting things start to happen once you inline allocator functions as 
 well.
 See PR29286 and PR33407 which we still don't handle 100% correct.
 

I browsed through the two bugs -- it seems that the compiler should get this
right regardless -- local pointer analysis should detect the must-aliasing and
should overrule the type-based aliasing decision when the placement new is
inlined. If it is not inlined, the compiler should know the exact semantics of
placement new (return == arg), or treat it conservatively. 

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383



[Bug c++/23383] builtin array operator new is not marked with malloc attribute

2008-06-04 Thread davidxl at gcc dot gnu dot org


--- Comment #15 from davidxl at gcc dot gnu dot org  2008-06-04 17:34 
---
(In reply to comment #14)
 We do the exact opposite - type-based rules override points-to must-alias
 information (or really may-alias information).  Also for the proposed scheme
 to work you need to guarantee that you always can compute correct points-to
 relations (I mean, if points-to information says pt_anything and if you
 then assume must-alias and thus a conflict then you simply disable TBAA
 completely).
 

Right, in general, type alias rules should override field- and flow-insensitive
pointer aliasing information, as the latter has a very low confidence level
(especially for the pt_anything case, which is just a baseless guess) -- but
precise/trustworthy aliasing info should be checked before assertion-based
alias information to decide whether to proceed. 

For example:

if (no_alias_according_to_conservative_pointer_info) return no_alias;
if (no_alias_according_to_precise_pointer_info) return no_alias;
if (must_alias or definitely_may_alias) return may/must_alias;   (1)

// now proceed with type based rules, etc.


This is in theory. In practice, it can be tricky to tag the confidence level of
aliasing info.
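
The ordering above can be sketched as a small C function. Everything here is hypothetical: the `alias_facts` struct simply stands in for the answers each analysis would give, and none of it reflects GCC's alias oracle interface.

```c
enum alias_result { NO_ALIAS, MAY_ALIAS, MUST_ALIAS };

/* Hypothetical summary of what each analysis concluded for one query.  */
struct alias_facts {
    int conservative_says_no_alias;
    int precise_says_no_alias;
    int precise_says_must_alias;
    int precise_says_may_alias;   /* "definitely may alias" */
    int tbaa_says_no_alias;
};

enum alias_result query_alias(const struct alias_facts *f)
{
    if (f->conservative_says_no_alias)
        return NO_ALIAS;
    if (f->precise_says_no_alias)
        return NO_ALIAS;
    /* High-confidence pointer information wins over type-based rules.  */
    if (f->precise_says_must_alias)
        return MUST_ALIAS;
    if (f->precise_says_may_alias)
        return MAY_ALIAS;
    /* Only now fall back to type-based rules.  */
    return f->tbaa_says_no_alias ? NO_ALIAS : MAY_ALIAS;
}
```

In this sketch a precise must-alias fact is returned even when the type-based rule would say "no alias", which is the inlined placement new situation described earlier in the thread.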

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383