On Wed, May 9, 2012 at 4:12 PM, Richard Guenther <richard.guent...@gmail.com> wrote: > On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li <davi...@google.com> wrote: >> To be clear, this flag is for malloc implementation (such as tcmalloc) >> with side effect unknown to the compiler. Using -fno-builtin-xxx is >> too conservative for that purpose. > > I don't think that flys. Btw, the patch also guards alloca - alloca is purely > GCC internal. > > What's the "unknown side-effects" that are also important to preserve > for free(malloc(4))?
Malloc implementation may record some info to a global structure, and a program may use this free(malloc()) pair to simulate the real runs to get some data, such as peak memory requirement. Dehao > > Richard. > >> David >> >> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen <de...@google.com> wrote: >>> Hello, >>> >>> This patch adds a flag to guard the optimization that optimize the >>> following code away: >>> >>> free (malloc (4)); >>> >>> In some cases, we'd like this type of malloc/free pairs to remain in >>> the optimized code. >>> >>> Tested with bootstrap, and no regression in the gcc testsuite. >>> >>> Is it ok for mainline? >>> >>> Thanks, >>> Dehao >>> >>> gcc/ChangeLog >>> 2012-05-08 Dehao Chen <de...@google.com> >>> >>> * common.opt (feliminate-malloc): New. >>> * doc/invoke.texi: Document it. >>> * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. >>> >>> gcc/testsuite/ChangeLog >>> 2012-05-08 Dehao Chen <de...@google.com> >>> >>> * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working >>> as expected. >>> >>> Index: gcc/doc/invoke.texi >>> =================================================================== >>> --- gcc/doc/invoke.texi (revision 187277) >>> +++ gcc/doc/invoke.texi (working copy) >>> @@ -360,7 +360,8 @@ >>> -fcx-limited-range @gol >>> -fdata-sections -fdce -fdelayed-branch @gol >>> -fdelete-null-pointer-checks -fdevirtualize -fdse @gol >>> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol >>> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations >>> @gol >>> +-ffat-lto-objects @gol >>> -ffast-math -ffinite-math-only -ffloat-store >>> -fexcess-precision=@var{style} @gol >>> -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol >>> -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol >>> @@ -6238,6 +6239,7 @@ >>> -fdefer-pop @gol >>> -fdelayed-branch @gol >>> -fdse @gol >>> +-feliminate-malloc @gol >>> -fguess-branch-probability @gol >>> -fif-conversion2 @gol >>> -fif-conversion @gol >>> @@ -6762,6 +6764,11 @@ >>> Perform dead store elimination (DSE) on RTL@. >>> Enabled by default at @option{-O} and higher. >>> >>> +@item -feliminate-malloc >>> +@opindex feliminate-malloc >>> +Eliminate unnecessary malloc/free pairs. >>> +Enabled by default at @option{-O} and higher. >>> + >>> @item -fif-conversion >>> @opindex fif-conversion >>> Attempt to transform conditional jumps into branch-less equivalents. This >>> Index: gcc/testsuite/gcc.dg/free-malloc.c >>> =================================================================== >>> --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >>> +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >>> @@ -0,0 +1,12 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -fno-eliminate-malloc" } */ >>> +/* { dg-final { scan-assembler-times "malloc" 2} } */ >>> +/* { dg-final { scan-assembler-times "free" 2} } */ >>> + >>> +extern void * malloc (unsigned long); >>> +extern void free (void *); >>> + >>> +void test () >>> +{ >>> + free (malloc (10)); >>> +} >>> Index: gcc/common.opt >>> =================================================================== >>> --- gcc/common.opt (revision 187277) >>> +++ gcc/common.opt (working copy) >>> @@ -1474,6 +1474,10 @@ >>> Common Var(flag_dce) Init(1) Optimization >>> Use the RTL dead code elimination pass >>> >>> +feliminate-malloc >>> +Common Var(flag_eliminate_malloc) Init(1) Optimization >>> +Eliminate unnecessary malloc/free pairs >>> + >>> fdse >>> Common Var(flag_dse) Init(1) Optimization >>> Use the RTL dead store elimination pass >>> Index: gcc/tree-ssa-dce.c >>> =================================================================== >>> --- gcc/tree-ssa-dce.c (revision 187277) >>> +++ gcc/tree-ssa-dce.c (working copy) >>> @@ -309,6 +309,8 @@ >>> case BUILT_IN_CALLOC: >>> case BUILT_IN_ALLOCA: >>> case BUILT_IN_ALLOCA_WITH_ALIGN: >>> + if (!flag_eliminate_malloc) >>> + mark_stmt_necessary (stmt, true); >>> return; >>> >>> default:;