[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)

2015-09-17 Thread daniel.gutson at tallertechnologies dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Daniel Gutson  changed:

   What|Removed |Added

 CC||daniel.gutson@tallertechnol
   ||ogies.com

--- Comment #24 from Daniel Gutson  ---
This optimization breaks code, please see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67618


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #17 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
---
(In reply to Marc Glisse from comment #16)
 Done. Joost, feel free to add your testcase from comment #3 if you want to
 (I can't write a hello world in fortran so I will avoid adding such
 testcases myself).

Thanks Marc, I don't have write access, but I can try to dg-ify the testcase
from comment #3.. however, first test, it still seems to contain a call to
builtin_malloc at -O2, seems to work at -O3... expected ?


Also, my nightly CP2K tester fails with :

0xa63a0f crash_signal
../../gcc/gcc/toplev.c:337
0x871f76 bb_seq_addr
../../gcc/gcc/gimple.h:1389
0x871f76 gsi_start_bb
../../gcc/gcc/gimple-iterator.h:118
0x871f76 gsi_for_stmt(gimple_statement_base*)
../../gcc/gcc/gimple-iterator.c:620
0xbfe1c1 handle_builtin_memset
../../gcc/gcc/tree-ssa-strlen.c:1653
0xbfe1c1 strlen_optimize_stmt
../../gcc/gcc/tree-ssa-strlen.c:1917
0xbfe1c1 strlen_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-ssa-strlen.c:2096
0xfa483a dom_walker::walk(basic_block_def*)
../../gcc/gcc/domwalk.c:177
0xbf963d execute
../../gcc/gcc/tree-ssa-strlen.c:2170
Please submit a full bug report,

which I suppose is related to this patch... I'll see if I can get a testcase.


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread Joost.VandeVondele at mat dot ethz.ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #18 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
---
The following now fails, so'll reopen this PR. It is at least related to
zeroing pvec twice in a row, and doesn seem to happen if I manually inline the
routine get_pseudo_param .

 cat bug.f90
MODULE atom_fit
  INTEGER, PARAMETER :: dp=8
CONTAINS
  SUBROUTINE atom_fit_pseudo ()
REAL(KIND=dp), ALLOCATABLE, DIMENSION(:) :: x, xi
LOGICAL :: lsdpot
ALLOCATE(xi(200),STAT=ierr)
CALL get_pseudo_param(xi,lsdpot)
CALL foo(xi)
  END SUBROUTINE atom_fit_pseudo
  SUBROUTINE get_pseudo_param (pvec,lsdpot)
REAL(KIND=dp), DIMENSION(:), INTENT(out) :: pvec
LOGICAL :: lsdpot
IF(lsdpot) THEN
  pvec = 0
  pvec = 0
END IF
  END SUBROUTINE get_pseudo_param
END MODULE atom_fit

 gfortran -c -O3 bug.f90
bug.f90: In function ‘atom_fit_pseudo’:
bug.f90:4:0: internal compiler error: Segmentation fault
   SUBROUTINE atom_fit_pseudo ()
 ^
0xa63a0f crash_signal
../../gcc/gcc/toplev.c:337
0x871f76 bb_seq_addr
../../gcc/gcc/gimple.h:1389
0x871f76 gsi_start_bb
../../gcc/gcc/gimple-iterator.h:118
0x871f76 gsi_for_stmt(gimple_statement_base*)
../../gcc/gcc/gimple-iterator.c:620
0xbfe1c1 handle_builtin_memset
../../gcc/gcc/tree-ssa-strlen.c:1653
0xbfe1c1 strlen_optimize_stmt
../../gcc/gcc/tree-ssa-strlen.c:1917
0xbfe1c1 strlen_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-ssa-strlen.c:2096
0xfa483a dom_walker::walk(basic_block_def*)
../../gcc/gcc/domwalk.c:177
0xbf963d execute
../../gcc/gcc/tree-ssa-strlen.c:2170
Please submit a full bug report,

[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #19 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Joost VandeVondele from comment #17)
 Thanks Marc, I don't have write access, but I can try to dg-ify the testcase
 from comment #3.. however, first test, it still seems to contain a call to
 builtin_malloc at -O2, seems to work at -O3... expected ?

Yes, at -O2 you don't have a call to memset, so my patch does nothing. It is
the same as my C++ testcase basically, so we don't really need the extra
testcase.

 Also, my nightly CP2K tester fails with :
 
 0xa63a0f crash_signal
 ../../gcc/gcc/toplev.c:337
 0x871f76 bb_seq_addr
 ../../gcc/gcc/gimple.h:1389
 0x871f76 gsi_start_bb
 ../../gcc/gcc/gimple-iterator.h:118
 0x871f76 gsi_for_stmt(gimple_statement_base*)
 ../../gcc/gcc/gimple-iterator.c:620
 0xbfe1c1 handle_builtin_memset
 ../../gcc/gcc/tree-ssa-strlen.c:1653
 0xbfe1c1 strlen_optimize_stmt
 ../../gcc/gcc/tree-ssa-strlen.c:1917
 0xbfe1c1 strlen_dom_walker::before_dom_children(basic_block_def*)
 ../../gcc/gcc/tree-ssa-strlen.c:2096
 0xfa483a dom_walker::walk(basic_block_def*)
 ../../gcc/gcc/domwalk.c:177
 0xbf963d execute
 ../../gcc/gcc/tree-ssa-strlen.c:2170
 Please submit a full bug report,
 
 which I suppose is related to this patch... I'll see if I can get a testcase.

Yes, please open a new PR with the testcase and Cc: me, thanks.


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #20 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Joost VandeVondele from comment #18)
 The following now fails, so'll reopen this PR. It is at least related to
 zeroing pvec twice in a row, and doesn seem to happen if I manually inline
 the routine get_pseudo_param .

Hum, right, I thought I had tested that, but it was in an earlier version of
the patch and I forgot to add it to one of the testcases :-(

void*f(){
  char*p=malloc(42);
  memset(p,0,42);
  memset(p,0,42);
  return p;
};


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #21 from Marc Glisse glisse at gcc dot gnu.org ---
I am testing the following:

--- tree-ssa-strlen.c(revision 211967)
+++ tree-ssa-strlen.c(working copy)
@@ -1646,20 +1646,22 @@ handle_builtin_memset (gimple_stmt_itera
   enum built_in_function code1 = DECL_FUNCTION_CODE (callee1);
   tree size = gimple_call_arg (stmt2, 2);
   if (code1 == BUILT_IN_CALLOC)
 /* Not touching stmt1 */ ;
   else if (code1 == BUILT_IN_MALLOC
 operand_equal_p (gimple_call_arg (stmt1, 0), size, 0))
 {
   gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1);
   update_gimple_call (gsi1, builtin_decl_implicit (BUILT_IN_CALLOC), 2,
   size, build_one_cst (size_type_node));
+  si1-length = build_int_cst (size_type_node, 0);
+  si1-stmt = gsi_stmt (gsi1);
 }
   else
 return true;
   tree lhs = gimple_call_lhs (stmt2);
   unlink_stmt_vdef (stmt2);
   if (lhs)
 {
   gimple assign = gimple_build_assign (lhs, ptr);
   gsi_replace (gsi, assign, false);
 }


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #22 from Marc Glisse glisse at gcc dot gnu.org ---
Author: glisse
Date: Wed Jun 25 12:27:13 2014
New Revision: 211977

URL: https://gcc.gnu.org/viewcvs?rev=211977root=gccview=rev
Log:
2014-06-25  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/57742
gcc/
* tree-ssa-strlen.c (handle_builtin_memset): Update strinfo
after replacing the statement.
gcc/testsuite/
* gcc.dg/tree-ssa/calloc-3.c: New file.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-strlen.c


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Marc Glisse glisse at gcc dot gnu.org changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #23 from Marc Glisse glisse at gcc dot gnu.org ---
Fixed.


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #15 from Marc Glisse glisse at gcc dot gnu.org ---
Author: glisse
Date: Tue Jun 24 18:50:00 2014
New Revision: 211956

URL: https://gcc.gnu.org/viewcvs?rev=211956root=gccview=rev
Log:
2014-06-24  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/57742
gcc/
* tree-ssa-strlen.c (get_string_length): Ignore malloc.
(handle_builtin_malloc, handle_builtin_memset): New functions.
(strlen_optimize_stmt): Call them.
* passes.def: Move strlen after loop+dom but before vrp.
gcc/testsuite/
* g++.dg/tree-ssa/calloc.C: New testcase.
* gcc.dg/tree-ssa/calloc-1.c: Likewise.
* gcc.dg/tree-ssa/calloc-2.c: Likewise.
* gcc.dg/strlenopt-9.c: Adapt.

Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/calloc.C
trunk/gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/calloc-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/passes.def
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/strlenopt-9.c
trunk/gcc/tree-ssa-strlen.c


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-06-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Marc Glisse glisse at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |4.10.0

--- Comment #16 from Marc Glisse glisse at gcc dot gnu.org ---
Done. Joost, feel free to add your testcase from comment #3 if you want to (I
can't write a hello world in fortran so I will avoid adding such testcases
myself).


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-02-23 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Marc Glisse glisse at gcc dot gnu.org changed:

   What|Removed |Added

  Attachment #30981|0   |1
is obsolete||
  Attachment #31003|0   |1
is obsolete||

--- Comment #14 from Marc Glisse glisse at gcc dot gnu.org ---
Created attachment 32204
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32204action=edit
New patch

This seems to work. It also handles the fortran example from comment #3. With a
comment before the new function and a testcase, it will be good to go to
gcc-patches.

Side note: at -O3, if I provide an inline version of operator new (see PR
59894), it handles std::vectorint(n). However, I had to provide a simple one
(call malloc, if null throw). The one in libsupc++ is way too complicated (2
calls to malloc), and if I refactor it slightly so malloc only appears once,
I end up with the following. The edge probabilities are strange (malloc fails
in 95% of cases?), but mostly we have a PHI node with a single argument which
hides the fact that the variables are the same. It is far from the first time I
notice this, is there a real reason to keep those unary PHIs, or should we
optimize them more aggressively?

  p_24 = mallocD.1405 (sz_20);
  if (p_24 == 0B)
goto bb 7;
  else
goto bb 11;
;;succ:   7 [95.5%]  (TRUE_VALUE,EXECUTABLE)
;;11 [4.5%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 11, loop depth 0, count 0, freq 349, maybe hot
;;prev block 10, next block 12, flags: (NEW, REACHABLE)
;;pred:   10 [4.5%]  (FALSE_VALUE,EXECUTABLE)
  # PT = { D.16587 } (escaped heap)
  # ALIGN = 8, MISALIGN = 0
  # p_41 = PHI p_24(10)
  # .MEM_42 = VDEF .MEM_34
  MEM[(struct _Vector_baseD.14156 *)p_2(D)]._M_implD.15030._M_startD.15032 =
p_41;
  # PT = { D.16587 } (escaped heap)
  # ALIGN = 4, MISALIGN = 0
  _19 = p_41 + sz_20;
  # .MEM_44 = VDEF .MEM_42
  MEM[(struct _Vector_baseD.14156
*)p_2(D)]._M_implD.15030._M_end_of_storageD.15034 = _19;
  # .MEM_8 = VDEF .MEM_44
  # USE = anything 
  # CLB = anything 
  memsetD.1000 (p_41, 0, sz_20);


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2014-02-22 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #13 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #12)
 Yes, the fact that the return value p cannot be equal to q inside the
 function is not exposable.

Richard fixed this in PR 50262, yay!
So this PR is worth working on again.


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-10-14
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Marc Glisse from comment #1)
 Created attachment 30981 [details]
 basic patch
 
 This is a very limited version of this optimization. It is in
 simplify_builtin_call, so only triggers if malloc/calloc is
 SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means
 we would need plenty of tests protecting against cases where the
 transformation would be wrong. Note that this transforms:
 p=malloc(n);
 if(cond)memset(p,0,n);
 into:
 p=calloc(n,1);
 cond;
 which is good if cond is p!=0 but may not always be so great otherwise.

;)  post-dominator tests (or simply tests whether both calls are in the
same basic-block ...).

Also you can transform

p = malloc (n);
if (p)
  memset (p, 0, n);

which might be a common-enough case to optimize for.

 I won't post this to gcc-patches, I think we want something more general
 (dereferencing a double* between the 2 statements shouldn't break it) or
 nothing.

dereferencing a double wouldn't have a VDEF (unless you store a double).


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat dot 
ethz
   ||.ch

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
---
just wondering if this would also catch a Fortran testcase, which compiles into
malloc and memset:

MODULE M
  REAL*8, DIMENSION(:), ALLOCATABLE, SAVE :: data
  INTEGER, PARAMETER :: n=2**16
CONTAINS
  SUBROUTINE TEST
ALLOCATE(data(n))
data(:)=0
  END SUBROUTINE
END MODULE


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #4 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #2)
 (In reply to Marc Glisse from comment #1)
  This is a very limited version of this optimization. It is in
  simplify_builtin_call, so only triggers if malloc/calloc is
  SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means
  we would need plenty of tests protecting against cases where the
  transformation would be wrong. Note that this transforms:
  p=malloc(n);
  if(cond)memset(p,0,n);
  into:
  p=calloc(n,1);
  cond;
  which is good if cond is p!=0 but may not always be so great otherwise.
 
 ;)  post-dominator tests (or simply tests whether both calls are in the
 same basic-block ...).

Same basic block is quite limited, and for the condition below we don't
directly have post-domination, we would need post-domination between the bbs
with gimple_cond and malloc, and the bb of memset with the landing block of the
gimple_cond. But even finding the gimple_cond in: malloc; loop; cond; loop;
memset; can be hard. I guess I'll have to limit my expectations a bit...

 Also you can transform
 
 p = malloc (n);
 if (p)
   memset (p, 0, n);
 
 which might be a common-enough case to optimize for.

Yes, that's the goal.

 dereferencing a double wouldn't have a VDEF (unless you store a double).

I do want to be able to store in between, so I think I have to walk the vdef
chain. But as soon as I do that, I need to make sure that the writes are to
places that can't alias, which complicates things a lot (and it can get a bit
expensive in a function with many memset). Consider this program:

#include vector
void f(void*p,int n){ new(p)std::vectorint(n,0); }

With -O3, we end up with:

  _27 = operator new (_26);
  MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_start = _27;
  MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_finish = _27;
  _16 = _27 + _26;
  MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_end_of_storage = _16;
  __builtin_memset (_27, 0, _26);

which has memory stores between the allocation and memset. That's exactly the
type of code where I'd want the optimization to apply. Joost's example has the
same pattern: malloc, test for 0, several unrelated memory stores, memset.

(how to handle the fact that we have operator new and not malloc is a different
issue, I am thinking of having a mode/flag where we promise not to replace
operator new so it can be inlined, which will include an if(p!=0) test)

It would be great (in particular for application-specific plugins) to have an
easy way to say things like: this is the next read/write use of this memory
region (but other memory regions may be used in between), and it isn't
post-dominated only because of this gimple_cond, etc. It's almost noon, too
late to be dreaming ;-)


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #5 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Marc Glisse from comment #4)
 (In reply to Richard Biener from comment #2)
  (In reply to Marc Glisse from comment #1)
   This is a very limited version of this optimization. It is in
   simplify_builtin_call, so only triggers if malloc/calloc is
   SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it 
   means
   we would need plenty of tests protecting against cases where the
   transformation would be wrong. Note that this transforms:
   p=malloc(n);
   if(cond)memset(p,0,n);
   into:
   p=calloc(n,1);
   cond;
   which is good if cond is p!=0 but may not always be so great otherwise.
  
  ;)  post-dominator tests (or simply tests whether both calls are in the
  same basic-block ...).
 
 Same basic block is quite limited, and for the condition below we don't
 directly have post-domination, we would need post-domination between the bbs
 with gimple_cond and malloc, and the bb of memset with the landing block of
 the gimple_cond. But even finding the gimple_cond in: malloc; loop; cond;
 loop; memset; can be hard. I guess I'll have to limit my expectations a
 bit...
 
  Also you can transform
  
  p = malloc (n);
  if (p)
memset (p, 0, n);
  
  which might be a common-enough case to optimize for.
 
 Yes, that's the goal.
 
  dereferencing a double wouldn't have a VDEF (unless you store a double).
 
 I do want to be able to store in between, so I think I have to walk the vdef
 chain. But as soon as I do that, I need to make sure that the writes are to
 places that can't alias, which complicates things a lot (and it can get a
 bit expensive in a function with many memset). Consider this program:
 
 #include vector
 void f(void*p,int n){ new(p)std::vectorint(n,0); }
 
 With -O3, we end up with:
 
   _27 = operator new (_26);
   MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_start = _27;
   MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_finish = _27;
   _16 = _27 + _26;
   MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_end_of_storage = _16;
   __builtin_memset (_27, 0, _26);
 
 which has memory stores between the allocation and memset. That's exactly
 the type of code where I'd want the optimization to apply. Joost's example
 has the same pattern: malloc, test for 0, several unrelated memory stores,
 memset.

We have walk_aliased_vdefs for this.  Basically the first callback
you receive has to be the malloc, otherwise there is an aliasing
stmt inbetween.  Initialize the ao_ref with ao_ref_init_from_ptr_and_size.

 (how to handle the fact that we have operator new and not malloc is a
 different issue, I am thinking of having a mode/flag where we promise not to
 replace operator new so it can be inlined, which will include an if(p!=0)
 test)
 
 It would be great (in particular for application-specific plugins) to have
 an easy way to say things like: this is the next read/write use of this
 memory region (but other memory regions may be used in between), and it
 isn't post-dominated only because of this gimple_cond, etc. It's almost
 noon, too late to be dreaming ;-)

See above ;)


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #6 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
 We have walk_aliased_vdefs for this.  Basically the first callback
 you receive has to be the malloc, otherwise there is an aliasing
 stmt inbetween.

Cool! Last time I looked into a similar optimization, I needed to look also at
the memory reads, not just the writes, so it was significantly more
complicated. walk_aliased_vdefs looks perfect here, both for malloc+memset
where there is nothing to read before the memset, and for calloc+memset where
reading before or after the memset returns the same :-)


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #7 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
 We have walk_aliased_vdefs for this.  Basically the first callback
 you receive has to be the malloc, otherwise there is an aliasing
 stmt inbetween.  Initialize the ao_ref with ao_ref_init_from_ptr_and_size.

Hmm, there is a problem with that: I don't get a callback for malloc.
stmt_may_clobber_ref_p_1 only looks at the lhs of a call statement if it isn't
an SSA_NAME, so it considers that p=malloc(n) does not clobber MEM_REF[p]. This
kind of makes sense, it creates this memory, which is different from
clobbering. I can look at the def_stmt of the first argument of memset to find
the malloc, at least, but that doesn't help me with the memory checks.

Also, for this testcase:
void* f(int n,double*d){
  int* p=__builtin_malloc(n);
  ++*d;
  __builtin_memset(p,0,n);
  return p;
}
I actually get a callback for the store in *d, which gcc believes might alias
:-(

For this example:
void g(int*);
void* f(int n){
  int* p=__builtin_malloc(n);
  for(int i=0;i1;++i){
__builtin_memset(p,0,n);
g(p);
p[5]=10;
  }
  return p;
}
if I modify the aliasing machinery to make it believe that p=malloc does alias,
malloc is the first callback. I haven't added the dominance checks, but I
assume they will tell me that malloc dominates memset and memset postdominates
malloc, although I still shouldn't do the transformation.

Pretty depressed at this point...


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-14 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #8 from Marc Glisse glisse at gcc dot gnu.org ---
Created attachment 31003
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31003action=edit
walk_aliased_vdefs experiment

Incomplete patch I used for my previous comment.


[Bug tree-optimization/57742] memset(malloc(n),0,n) - calloc(n,1)

2013-10-11 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #1 from Marc Glisse glisse at gcc dot gnu.org ---
Created attachment 30981
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30981action=edit
basic patch

This is a very limited version of this optimization. It is in
simplify_builtin_call, so only triggers if malloc/calloc is
SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means we
would need plenty of tests protecting against cases where the transformation
would be wrong. Note that this transforms:
p=malloc(n);
if(cond)memset(p,0,n);
into:
p=calloc(n,1);
cond;
which is good if cond is p!=0 but may not always be so great otherwise.

I won't post this to gcc-patches, I think we want something more general
(dereferencing a double* between the 2 statements shouldn't break it) or
nothing.