[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Daniel Gutson changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.gutson@tallertechnologies.com

--- Comment #24 from Daniel Gutson ---
This optimization breaks code; please see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67618
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #17 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch ---
(In reply to Marc Glisse from comment #16)
> Done. Joost, feel free to add your testcase from comment #3 if you want to (I can't write a hello world in fortran so I will avoid adding such testcases myself).

Thanks Marc, I don't have write access, but I can try to dg-ify the testcase from comment #3. However, on a first test it still seems to contain a call to builtin_malloc at -O2, and seems to work at -O3... expected?

Also, my nightly CP2K tester fails with:

0xa63a0f crash_signal
        ../../gcc/gcc/toplev.c:337
0x871f76 bb_seq_addr
        ../../gcc/gcc/gimple.h:1389
0x871f76 gsi_start_bb
        ../../gcc/gcc/gimple-iterator.h:118
0x871f76 gsi_for_stmt(gimple_statement_base*)
        ../../gcc/gcc/gimple-iterator.c:620
0xbfe1c1 handle_builtin_memset
        ../../gcc/gcc/tree-ssa-strlen.c:1653
0xbfe1c1 strlen_optimize_stmt
        ../../gcc/gcc/tree-ssa-strlen.c:1917
0xbfe1c1 strlen_dom_walker::before_dom_children(basic_block_def*)
        ../../gcc/gcc/tree-ssa-strlen.c:2096
0xfa483a dom_walker::walk(basic_block_def*)
        ../../gcc/gcc/domwalk.c:177
0xbf963d execute
        ../../gcc/gcc/tree-ssa-strlen.c:2170
Please submit a full bug report,

which I suppose is related to this patch... I'll see if I can get a testcase.
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #18 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch ---
The following now fails, so I'll reopen this PR. It is at least related to zeroing pvec twice in a row, and doesn't seem to happen if I manually inline the routine get_pseudo_param.

cat bug.f90
MODULE atom_fit
  INTEGER, PARAMETER :: dp=8
CONTAINS
  SUBROUTINE atom_fit_pseudo ()
    REAL(KIND=dp), ALLOCATABLE, DIMENSION(:) :: x, xi
    LOGICAL :: lsdpot
    ALLOCATE(xi(200),STAT=ierr)
    CALL get_pseudo_param(xi,lsdpot)
    CALL foo(xi)
  END SUBROUTINE atom_fit_pseudo
  SUBROUTINE get_pseudo_param (pvec,lsdpot)
    REAL(KIND=dp), DIMENSION(:), INTENT(out) :: pvec
    LOGICAL :: lsdpot
    IF(lsdpot) THEN
      pvec = 0
      pvec = 0
    END IF
  END SUBROUTINE get_pseudo_param
END MODULE atom_fit

gfortran -c -O3 bug.f90
bug.f90: In function ‘atom_fit_pseudo’:
bug.f90:4:0: internal compiler error: Segmentation fault
   SUBROUTINE atom_fit_pseudo ()
 ^
0xa63a0f crash_signal
        ../../gcc/gcc/toplev.c:337
0x871f76 bb_seq_addr
        ../../gcc/gcc/gimple.h:1389
0x871f76 gsi_start_bb
        ../../gcc/gcc/gimple-iterator.h:118
0x871f76 gsi_for_stmt(gimple_statement_base*)
        ../../gcc/gcc/gimple-iterator.c:620
0xbfe1c1 handle_builtin_memset
        ../../gcc/gcc/tree-ssa-strlen.c:1653
0xbfe1c1 strlen_optimize_stmt
        ../../gcc/gcc/tree-ssa-strlen.c:1917
0xbfe1c1 strlen_dom_walker::before_dom_children(basic_block_def*)
        ../../gcc/gcc/tree-ssa-strlen.c:2096
0xfa483a dom_walker::walk(basic_block_def*)
        ../../gcc/gcc/domwalk.c:177
0xbf963d execute
        ../../gcc/gcc/tree-ssa-strlen.c:2170
Please submit a full bug report,
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #19 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Joost VandeVondele from comment #17)
> Thanks Marc, I don't have write access, but I can try to dg-ify the testcase from comment #3.. however, first test, it still seems to contain a call to builtin_malloc at -O2, seems to work at -O3... expected ?

Yes, at -O2 you don't have a call to memset, so my patch does nothing. It is the same as my C++ testcase basically, so we don't really need the extra testcase.

> Also, my nightly CP2K tester fails with :
>
> 0xa63a0f crash_signal
>         ../../gcc/gcc/toplev.c:337
> 0x871f76 bb_seq_addr
>         ../../gcc/gcc/gimple.h:1389
> 0x871f76 gsi_start_bb
>         ../../gcc/gcc/gimple-iterator.h:118
> 0x871f76 gsi_for_stmt(gimple_statement_base*)
>         ../../gcc/gcc/gimple-iterator.c:620
> 0xbfe1c1 handle_builtin_memset
>         ../../gcc/gcc/tree-ssa-strlen.c:1653
> 0xbfe1c1 strlen_optimize_stmt
>         ../../gcc/gcc/tree-ssa-strlen.c:1917
> 0xbfe1c1 strlen_dom_walker::before_dom_children(basic_block_def*)
>         ../../gcc/gcc/tree-ssa-strlen.c:2096
> 0xfa483a dom_walker::walk(basic_block_def*)
>         ../../gcc/gcc/domwalk.c:177
> 0xbf963d execute
>         ../../gcc/gcc/tree-ssa-strlen.c:2170
> Please submit a full bug report,
>
> which I suppose is related to this patch... I'll see if I can get a testcase.

Yes, please open a new PR with the testcase and Cc: me, thanks.
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #20 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Joost VandeVondele from comment #18)
> The following now fails, so'll reopen this PR. It is at least related to zeroing pvec twice in a row, and doesn seem to happen if I manually inline the routine get_pseudo_param .

Hum, right, I thought I had tested that, but it was in an earlier version of the patch and I forgot to add it to one of the testcases :-(

void*f(){
  char*p=malloc(42);
  memset(p,0,42);
  memset(p,0,42);
  return p;
};
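For readers who want to run the reduced testcase from comment #20 outside the compiler testsuite, here is a minimal sketch in standard C. The function names `alloc_zeroed_twice` and `all_zero` are made up for illustration, and the NULL check is an addition that the original testcase does not have; it is there only so the sketch stays well-defined if the allocation fails.

```c
#include <stdlib.h>
#include <string.h>

/* Same shape as the reduced testcase: one allocation, cleared twice.
   The NULL check is an addition for safety, not part of the original.  */
void *alloc_zeroed_twice(void)
{
    char *p = malloc(42);
    if (p == NULL)
        return NULL;
    memset(p, 0, 42);
    memset(p, 0, 42);   /* redundant second clear, as in the testcase */
    return p;
}

/* Hypothetical helper: returns 1 if all n bytes at p are zero.  */
int all_zero(const void *p, size_t n)
{
    const unsigned char *b = p;
    for (size_t i = 0; i < n; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Whatever the optimizer does with the two memset calls, a caller should observe 42 zero bytes; checking the result with `all_zero(p, 42)` at several optimization levels is a cheap sanity test.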
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #21 from Marc Glisse glisse at gcc dot gnu.org ---
I am testing the following:

--- tree-ssa-strlen.c	(revision 211967)
+++ tree-ssa-strlen.c	(working copy)
@@ -1646,20 +1646,22 @@ handle_builtin_memset (gimple_stmt_itera
   enum built_in_function code1 = DECL_FUNCTION_CODE (callee1);
   tree size = gimple_call_arg (stmt2, 2);
   if (code1 == BUILT_IN_CALLOC)
     /* Not touching stmt1 */ ;
   else if (code1 == BUILT_IN_MALLOC
 	   && operand_equal_p (gimple_call_arg (stmt1, 0), size, 0))
     {
       gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1);
       update_gimple_call (&gsi1, builtin_decl_implicit (BUILT_IN_CALLOC), 2,
 			  size, build_one_cst (size_type_node));
+      si1->length = build_int_cst (size_type_node, 0);
+      si1->stmt = gsi_stmt (gsi1);
     }
   else
     return true;
   tree lhs = gimple_call_lhs (stmt2);
   unlink_stmt_vdef (stmt2);
   if (lhs)
     {
       gimple assign = gimple_build_assign (lhs, ptr);
       gsi_replace (gsi, assign, false);
     }
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #22 from Marc Glisse glisse at gcc dot gnu.org ---
Author: glisse
Date: Wed Jun 25 12:27:13 2014
New Revision: 211977

URL: https://gcc.gnu.org/viewcvs?rev=211977&root=gcc&view=rev
Log:
2014-06-25  Marc Glisse  marc.gli...@inria.fr

    PR tree-optimization/57742
gcc/
    * tree-ssa-strlen.c (handle_builtin_memset): Update strinfo after
    replacing the statement.
gcc/testsuite/
    * gcc.dg/tree-ssa/calloc-3.c: New file.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-strlen.c
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Marc Glisse glisse at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #23 from Marc Glisse glisse at gcc dot gnu.org ---
Fixed.
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #15 from Marc Glisse glisse at gcc dot gnu.org ---
Author: glisse
Date: Tue Jun 24 18:50:00 2014
New Revision: 211956

URL: https://gcc.gnu.org/viewcvs?rev=211956&root=gcc&view=rev
Log:
2014-06-24  Marc Glisse  marc.gli...@inria.fr

    PR tree-optimization/57742
gcc/
    * tree-ssa-strlen.c (get_string_length): Ignore malloc.
    (handle_builtin_malloc, handle_builtin_memset): New functions.
    (strlen_optimize_stmt): Call them.
    * passes.def: Move strlen after loop+dom but before vrp.
gcc/testsuite/
    * g++.dg/tree-ssa/calloc.C: New testcase.
    * gcc.dg/tree-ssa/calloc-1.c: Likewise.
    * gcc.dg/tree-ssa/calloc-2.c: Likewise.
    * gcc.dg/strlenopt-9.c: Adapt.

Added:
    trunk/gcc/testsuite/g++.dg/tree-ssa/calloc.C
    trunk/gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/calloc-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/passes.def
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/strlenopt-9.c
    trunk/gcc/tree-ssa-strlen.c
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Marc Glisse glisse at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |4.10.0

--- Comment #16 from Marc Glisse glisse at gcc dot gnu.org ---
Done. Joost, feel free to add your testcase from comment #3 if you want to (I can't write a hello world in fortran so I will avoid adding such testcases myself).
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Marc Glisse glisse at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #30981|0                           |1
        is obsolete|                            |
  Attachment #31003|0                           |1
        is obsolete|                            |

--- Comment #14 from Marc Glisse glisse at gcc dot gnu.org ---
Created attachment 32204
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32204&action=edit
New patch

This seems to work. It also handles the fortran example from comment #3. With a comment before the new function and a testcase, it will be good to go to gcc-patches.

Side note: at -O3, if I provide an inline version of operator new (see PR 59894), it handles std::vector<int>(n). However, I had to provide a simple one (call malloc, if null throw). The one in libsupc++ is way too complicated (2 calls to malloc), and if I refactor it slightly so malloc only appears once, I end up with the following. The edge probabilities are strange (malloc fails in 95% of cases?), but mostly we have a PHI node with a single argument which hides the fact that the variables are the same. It is far from the first time I notice this, is there a real reason to keep those unary PHIs, or should we optimize them more aggressively?

  p_24 = mallocD.1405 (sz_20);
  if (p_24 == 0B)
    goto <bb 7>;
  else
    goto <bb 11>;
;;    succ:       7 [95.5%]  (TRUE_VALUE,EXECUTABLE)
;;                11 [4.5%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 11, loop depth 0, count 0, freq 349, maybe hot
;;    prev block 10, next block 12, flags: (NEW, REACHABLE)
;;    pred:       10 [4.5%]  (FALSE_VALUE,EXECUTABLE)
  # PT = { D.16587 } (escaped heap)
  # ALIGN = 8, MISALIGN = 0
  # p_41 = PHI <p_24(10)>
  # .MEM_42 = VDEF <.MEM_34>
  MEM[(struct _Vector_baseD.14156 *)p_2(D)]._M_implD.15030._M_startD.15032 = p_41;
  # PT = { D.16587 } (escaped heap)
  # ALIGN = 4, MISALIGN = 0
  _19 = p_41 + sz_20;
  # .MEM_44 = VDEF <.MEM_42>
  MEM[(struct _Vector_baseD.14156 *)p_2(D)]._M_implD.15030._M_end_of_storageD.15034 = _19;
  # .MEM_8 = VDEF <.MEM_44>
  # USE = anything
  # CLB = anything
  memsetD.1000 (p_41, 0, sz_20);
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #13 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #12)
> Yes, the fact that the return value p cannot be equal to q inside the function is not exposable.

Richard fixed this in PR 50262, yay! So this PR is worth working on again.
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-10-14
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Marc Glisse from comment #1)
> Created attachment 30981 [details]
> basic patch
>
> This is a very limited version of this optimization. It is in simplify_builtin_call, so only triggers if malloc/calloc is SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means we would need plenty of tests protecting against cases where the transformation would be wrong. Note that this transforms:
>   p=malloc(n);
>   if(cond)memset(p,0,n);
> into:
>   p=calloc(n,1);
>   cond;
> which is good if cond is p!=0 but may not always be so great otherwise.

;)  post-dominator tests (or simply tests whether both calls are in the same basic-block ...).

Also you can transform

  p = malloc (n);
  if (p)
    memset (p, 0, n);

which might be a common-enough case to optimize for.

> I won't post this to gcc-patches, I think we want something more general (dereferencing a double* between the 2 statements shouldn't break it) or nothing.

dereferencing a double wouldn't have a VDEF (unless you store a double).
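The "common-enough case" mentioned in comment #2 can be written out as a pair of functions that a caller cannot tell apart. This is a sketch in standard C; the names `zalloc_malloc_memset` and `zalloc_calloc` are invented for illustration and appear nowhere in GCC or in the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Source pattern: allocate, then clear only if the allocation succeeded.  */
void *zalloc_malloc_memset(size_t n)
{
    void *p = malloc(n);
    if (p)
        memset(p, 0, n);
    return p;
}

/* What the transformation aims to produce: a single zero-initializing
   allocation with the same size argument and an element size of 1.  */
void *zalloc_calloc(size_t n)
{
    return calloc(n, 1);
}
```

Both functions return either NULL or n zero bytes, which is why the guard `if (p)` is the one condition under which dropping the memset is always safe. With an arbitrary `cond`, the second version would clear memory that the first might have left untouched.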
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat dot ethz.ch

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch ---
Just wondering if this would also catch a Fortran testcase, which compiles into malloc and memset:

MODULE M
  REAL*8, DIMENSION(:), ALLOCATABLE, SAVE :: data
  INTEGER, PARAMETER :: n=2**16
CONTAINS
  SUBROUTINE TEST
    ALLOCATE(data(n))
    data(:)=0
  END SUBROUTINE
END MODULE
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #4 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #2)
> (In reply to Marc Glisse from comment #1)
> > This is a very limited version of this optimization. It is in simplify_builtin_call, so only triggers if malloc/calloc is SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means we would need plenty of tests protecting against cases where the transformation would be wrong. Note that this transforms:
> >   p=malloc(n);
> >   if(cond)memset(p,0,n);
> > into:
> >   p=calloc(n,1);
> >   cond;
> > which is good if cond is p!=0 but may not always be so great otherwise.
>
> ;)  post-dominator tests (or simply tests whether both calls are in the same basic-block ...).

Same basic block is quite limited, and for the condition below we don't directly have post-domination, we would need post-domination between the bbs with gimple_cond and malloc, and the bb of memset with the landing block of the gimple_cond. But even finding the gimple_cond in:
  malloc; loop; cond; loop; memset;
can be hard. I guess I'll have to limit my expectations a bit...

> Also you can transform
>
>   p = malloc (n);
>   if (p)
>     memset (p, 0, n);
>
> which might be a common-enough case to optimize for.

Yes, that's the goal.

> dereferencing a double wouldn't have a VDEF (unless you store a double).

I do want to be able to store in between, so I think I have to walk the vdef chain. But as soon as I do that, I need to make sure that the writes are to places that can't alias, which complicates things a lot (and it can get a bit expensive in a function with many memset).

Consider this program:

#include <vector>
void f(void*p,int n){
  new(p)std::vector<int>(n,0);
}

With -O3, we end up with:

  _27 = operator new (_26);
  MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_start = _27;
  MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_finish = _27;
  _16 = _27 + _26;
  MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_end_of_storage = _16;
  __builtin_memset (_27, 0, _26);

which has memory stores between the allocation and memset. That's exactly the type of code where I'd want the optimization to apply. Joost's example has the same pattern: malloc, test for 0, several unrelated memory stores, memset.

(how to handle the fact that we have operator new and not malloc is a different issue, I am thinking of having a mode/flag where we promise not to replace operator new so it can be inlined, which will include an if(p!=0) test)

It would be great (in particular for application-specific plugins) to have an easy way to say things like: this is the next read/write use of this memory region (but other memory regions may be used in between), and it isn't post-dominated only because of this gimple_cond, etc. It's almost noon, too late to be dreaming ;-)
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #5 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Marc Glisse from comment #4)
> (In reply to Richard Biener from comment #2)
> > (In reply to Marc Glisse from comment #1)
> > > This is a very limited version of this optimization. It is in simplify_builtin_call, so only triggers if malloc/calloc is SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means we would need plenty of tests protecting against cases where the transformation would be wrong. Note that this transforms:
> > >   p=malloc(n);
> > >   if(cond)memset(p,0,n);
> > > into:
> > >   p=calloc(n,1);
> > >   cond;
> > > which is good if cond is p!=0 but may not always be so great otherwise.
> >
> > ;)  post-dominator tests (or simply tests whether both calls are in the same basic-block ...).
>
> Same basic block is quite limited, and for the condition below we don't directly have post-domination, we would need post-domination between the bbs with gimple_cond and malloc, and the bb of memset with the landing block of the gimple_cond. But even finding the gimple_cond in:
>   malloc; loop; cond; loop; memset;
> can be hard. I guess I'll have to limit my expectations a bit...
>
> > Also you can transform
> >
> >   p = malloc (n);
> >   if (p)
> >     memset (p, 0, n);
> >
> > which might be a common-enough case to optimize for.
>
> Yes, that's the goal.
>
> > dereferencing a double wouldn't have a VDEF (unless you store a double).
>
> I do want to be able to store in between, so I think I have to walk the vdef chain. But as soon as I do that, I need to make sure that the writes are to places that can't alias, which complicates things a lot (and it can get a bit expensive in a function with many memset).
>
> Consider this program:
>
> #include <vector>
> void f(void*p,int n){
>   new(p)std::vector<int>(n,0);
> }
>
> With -O3, we end up with:
>
>   _27 = operator new (_26);
>   MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_start = _27;
>   MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_finish = _27;
>   _16 = _27 + _26;
>   MEM[(struct _Vector_base *)p_4(D)]._M_impl._M_end_of_storage = _16;
>   __builtin_memset (_27, 0, _26);
>
> which has memory stores between the allocation and memset. That's exactly the type of code where I'd want the optimization to apply. Joost's example has the same pattern: malloc, test for 0, several unrelated memory stores, memset.

We have walk_aliased_vdefs for this. Basically the first callback you receive has to be the malloc, otherwise there is an aliasing stmt inbetween. Initialize the ao_ref with ao_ref_init_from_ptr_and_size.

> (how to handle the fact that we have operator new and not malloc is a different issue, I am thinking of having a mode/flag where we promise not to replace operator new so it can be inlined, which will include an if(p!=0) test)
>
> It would be great (in particular for application-specific plugins) to have an easy way to say things like: this is the next read/write use of this memory region (but other memory regions may be used in between), and it isn't post-dominated only because of this gimple_cond, etc. It's almost noon, too late to be dreaming ;-)

See above ;)
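The rule stated in comment #5 (walking back from the memset, the first statement that touches the buffer must be its allocation) can be illustrated with a toy model in C. This is not GCC code and does not use walk_aliased_vdefs: `struct stmt`, `enum kind`, the integer `region` ids and `first_hit_is_malloc` are all hypothetical, the model covers a straight-line statement list only, and aliasing is reduced to comparing region ids.

```c
#include <stddef.h>

/* Toy statement kinds; a stand-in for GIMPLE, not the real thing.  */
enum kind { K_MALLOC, K_STORE, K_MEMSET };

struct stmt {
    enum kind kind;
    int region;   /* hypothetical id of the memory region created or written */
};

/* Walk backwards from the statement just before the memset.  Stores to
   other regions are skipped; the first statement touching the memset's
   region decides.  Returns 1 only if that statement is the allocation.  */
int first_hit_is_malloc(const struct stmt *s, size_t memset_idx)
{
    int region = s[memset_idx].region;
    for (size_t i = memset_idx; i-- > 0; ) {
        if (s[i].region != region)
            continue;                 /* unrelated store: keep walking */
        return s[i].kind == K_MALLOC; /* first hit: allocation or a clobber */
    }
    return 0;                         /* allocation not found in this list */
}
```

Under this model the std::vector dump from comment #4 has the shape allocation, three stores to another region, memset, so the walk reaches the allocation first and the rewrite would be allowed. Control flow is ignored entirely here, which a real implementation could not do.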
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #6 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> We have walk_aliased_vdefs for this. Basically the first callback you receive has to be the malloc, otherwise there is an aliasing stmt inbetween.

Cool! Last time I looked into a similar optimization, I needed to look also at the memory reads, not just the writes, so it was significantly more complicated. walk_aliased_vdefs looks perfect here, both for malloc+memset where there is nothing to read before the memset, and for calloc+memset where reading before or after the memset returns the same :-)
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #7 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> We have walk_aliased_vdefs for this. Basically the first callback you receive has to be the malloc, otherwise there is an aliasing stmt inbetween. Initialize the ao_ref with ao_ref_init_from_ptr_and_size.

Hmm, there is a problem with that: I don't get a callback for malloc. stmt_may_clobber_ref_p_1 only looks at the lhs of a call statement if it isn't an SSA_NAME, so it considers that p=malloc(n) does not clobber MEM_REF[p]. This kind of makes sense, it creates this memory, which is different from clobbering. I can look at the def_stmt of the first argument of memset to find the malloc, at least, but that doesn't help me with the memory checks.

Also, for this testcase:

void* f(int n,double*d){
  int* p=__builtin_malloc(n);
  ++*d;
  __builtin_memset(p,0,n);
  return p;
}

I actually get a callback for the store in *d, which gcc believes might alias :-(

For this example:

void g(int*);
void* f(int n){
  int* p=__builtin_malloc(n);
  for(int i=0;i<1;++i){
    __builtin_memset(p,0,n);
    g(p);
    p[5]=10;
  }
  return p;
}

if I modify the aliasing machinery to make it believe that p=malloc does alias, malloc is the first callback. I haven't added the dominance checks, but I assume they will tell me that malloc dominates memset and memset postdominates malloc, although I still shouldn't do the transformation.

Pretty depressed at this point...
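Why the loop example in comment #7 must not be rewritten can be seen by running both versions side by side. The sketch below is an illustration in standard C, not the testcase itself: the loop bound of 2, the buffer of 8 ints, and the names `observe`, `seen`, `with_memset_in_loop` and `with_calloc_only` are assumptions chosen so that the difference becomes observable.

```c
#include <stdlib.h>
#include <string.h>

int seen[2];   /* value of p[5] observed on each iteration */
int calls;

/* Stand-in for the external g(p): records what it sees in p[5].  */
void observe(const int *p) { seen[calls++] = p[5]; }

/* Original shape: the buffer is cleared again on every iteration.  */
int *with_memset_in_loop(void)
{
    calls = 0;
    int *p = malloc(8 * sizeof *p);
    if (!p)
        return NULL;
    for (int i = 0; i < 2; ++i) {
        memset(p, 0, 8 * sizeof *p);
        observe(p);
        p[5] = 10;
    }
    return p;
}

/* Invalid rewrite: calloc up front, memset dropped from the loop.  */
int *with_calloc_only(void)
{
    calls = 0;
    int *p = calloc(8, sizeof *p);
    if (!p)
        return NULL;
    for (int i = 0; i < 2; ++i) {
        observe(p);
        p[5] = 10;
    }
    return p;
}
```

In the first function `observe` sees 0 on both iterations; in the second it sees 0 and then 10, because the store `p[5] = 10` from the previous iteration is no longer wiped. The malloc still dominates the memset and the memset still post-dominates the malloc, so dominance checks alone would not reject this case.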
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #8 from Marc Glisse glisse at gcc dot gnu.org ---
Created attachment 31003
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31003&action=edit
walk_aliased_vdefs experiment

Incomplete patch I used for my previous comment.
[Bug tree-optimization/57742] memset(malloc(n),0,n) -> calloc(n,1)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57742

--- Comment #1 from Marc Glisse glisse at gcc dot gnu.org ---
Created attachment 30981
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30981&action=edit
basic patch

This is a very limited version of this optimization. It is in simplify_builtin_call, so only triggers if malloc/calloc is SSA_NAME_DEF_STMT(gimple_vuse(memset_stmt)). However, generalizing it means we would need plenty of tests protecting against cases where the transformation would be wrong. Note that this transforms:
  p=malloc(n);
  if(cond)memset(p,0,n);
into:
  p=calloc(n,1);
  cond;
which is good if cond is p!=0 but may not always be so great otherwise.

I won't post this to gcc-patches, I think we want something more general (dereferencing a double* between the 2 statements shouldn't break it) or nothing.