Re: Do not produce empty try-finally statements
Hi, I tested the following patch. It makes inliner to ignore empty EH clenaup regions regadless they are internal or external in anticipation they will be removed. It also make tree-eh to not do any cleanups pre-inline. This is independent change, but I wanted to stress the code bit more. Here are stats of tramp3d compilation: Mainline: einline: 27894 inlined calls Inlined 31427 calls, eliminated 6496 functions textdata bss dec hex 575081 401082 576203 8cacb Patch to avoid EH regions with clobbers (the one I started thread with): einline: 27954 inlined calls Inlined 31512 calls, eliminated 6496 functions textdata bss dec hex 572117 401082 573239 8bf37 New patch: einline: 27954 inlined calls Inlined 31502 calls, eliminated 6494 functions textdata bss dec hex 579248 401082 580370 8db12 No exceptions: einline: 25528 inlined calls Inlined 29194 calls, eliminated 6527 functions textdata bss dec hex 551555 401082 552677 86ee5 It seems tha tthis patch still does bit worse than the original (wrong) proposed patch, but it gets things better than mainline. I wonder what other kinds of cleanups we are missing. Note that the number of inlines in no-exception case is lower probably because quite few inlines are dtors in EH cleanups. Also the text size includes EH tables, as usual for size. Bootrstrapped/regtested x86_64-linux. I plan to commit the ipa-inline-analysis.c change if there are no complains. Does the tree-eh change look resonable based on this thread? I agree WRT eh cleanup and -O0. Shall I make a patch? * tree-eh.c (optimize_clobbers): Do nothing before inlining * ipa-inline-analysis.c (clobber_only_eh_bb_p): New function. (estimate_function_body_sizes): Use it. Index: tree-eh.c === --- tree-eh.c (revision 206946) +++ tree-eh.c (working copy) @@ -3381,6 +3381,11 @@ edge_iterator ei; edge e; + /* Before inlining we do not want to optimize away clobbers that may become + internal when inlined. */ + if (optimize && !cfun->after_inlining) +return; + /* Only optimize anything if the bb contains at least one clobber, ends with resx (checked by caller), optionally contains some debug stmts or labels, or at most one __builtin_stack_restore Index: ipa-inline-analysis.c === --- ipa-inline-analysis.c (revision 206946) +++ ipa-inline-analysis.c (working copy) @@ -2347,6 +2347,56 @@ return NULL; } +/* Return true when the basic blocks contains only clobbers followed by RESX. + Such BBs are kept around to make removal of dead stores possible with + presence of EH and will be optimized out by optimize_clobbers later in the + game. + + NEED_EH is used to recurse in case the clobber has non-EH predecestors + that can be clobber only, too.. When it is false, the RESX is not necessary + on the end of basic block. */ + +static bool +clobber_only_eh_bb_p (basic_block bb, bool need_eh = true) +{ + gimple_stmt_iterator gsi = gsi_last_bb (bb); + edge_iterator ei; + edge e; + + if (need_eh) +{ + if (gsi_end_p (gsi)) + return false; + if (gimple_code (gsi_stmt (gsi)) != GIMPLE_RESX) +return false; + gsi_prev (&gsi); +} + else if (!single_succ_p (bb)) +return false; + + for (; !gsi_end_p (gsi); gsi_prev (&gsi)) +{ + gimple stmt = gsi_stmt (gsi); + if (is_gimple_debug (stmt)) + continue; + if (gimple_clobber_p (stmt)) + continue; + if (gimple_code (stmt) == GIMPLE_LABEL) + break; + return false; +} + + /* See if all predecestors are either throws or clobber only BBs. */ + FOR_EACH_EDGE (e, ei, bb->preds) +if (!(e->flags & EDGE_EH) + && !clobber_only_eh_bb_p (e->src, false)) + return false; + + if (!need_eh) +debug_bb (bb); + return true; +} + /* Compute function body size parameters for NODE. When EARLY is true, we compute only simple summaries without non-trivial predicates to drive the early inliner. */ @@ -2410,6 +2460,14 @@ { bb = BASIC_BLOCK_FOR_FN (cfun, order[n]); freq = compute_call_stmt_bb_frequency (node->decl, bb); + if (clobber_only_eh_bb_p (bb)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "\n Ignoring BB %i;" +" it will be optimized away by cleanup_clobbers\n", +bb->index); + continue; + } /* TODO: Obviously predicates can be propagated down across CFG. */ if (parms_info)
Re: Do not produce empty try-finally statements
Hi, On Wed, 29 Jan 2014, Jakub Jelinek wrote: > > > It is also problem of inliner quality decisions and memory > > > use/compile time. The in-memory representation of unnecesary EH is > > > quite big. > > > > > > I am quite ignorant in this area, but for -O0 can't we simply > > > disable all clobbers? > > > > That sounds reasonable, yeah. > > Don't we need stack sharing even for -O0? Otherwise we could end up with > too big stack frames in some cases. Well, where to draw the line? Using -O0 could also end up with too slow code in some cases, still we don't enable optimizations. > Running ehcleanup1 for -O0, even if limited to just removal of clobbers, > seems better to me. OTOH -O0 is also supposed to be quick compilation... Well, I have no hard opinions there. Ciao, Michael.
Re: Do not produce empty try-finally statements
> But that will badly regress the stack usage, won't it? If so, that's a > blocker for Ada where we have big temporaries. For example this will presumably cause gnat.dg/stack_usage1.adb to fail. -- Eric Botcazou
Re: Do not produce empty try-finally statements
> That sounds reasonable, yeah. But that will badly regress the stack usage, won't it? If so, that's a blocker for Ada where we have big temporaries. -- Eric Botcazou
Re: Do not produce empty try-finally statements
On Wed, Jan 29, 2014 at 03:06:45PM +0100, Michael Matz wrote: > Hi, > > On Tue, 28 Jan 2014, Jan Hubicka wrote: > > > It is also problem of inliner quality decisions and memory use/compile > > time. The in-memory representation of unnecesary EH is quite big. > > > > I am quite ignorant in this area, but for -O0 can't we simply disable > > all clobbers? > > That sounds reasonable, yeah. Don't we need stack sharing even for -O0? Otherwise we could end up with too big stack frames in some cases. Running ehcleanup1 for -O0, even if limited to just removal of clobbers, seems better to me. Jakub
Re: Do not produce empty try-finally statements
Hi, On Tue, 28 Jan 2014, Jan Hubicka wrote: > It is also problem of inliner quality decisions and memory use/compile > time. The in-memory representation of unnecesary EH is quite big. > > I am quite ignorant in this area, but for -O0 can't we simply disable > all clobbers? That sounds reasonable, yeah. Ciao, Michael.
Re: Do not produce empty try-finally statements
> Hi, > > On Tue, 28 Jan 2014, Richard Biener wrote: > > > >> The EH optimizations involving cleanups with only clobbers in them > > >> are that if at the end of the cleanup after only CLOBBER stmts you > > >> would rethrow the exception externally, then the clobber isn't needed > > >> and the whole cleanup can be removed. And, if it rethrows somewhere > > >> internally, we can move the clobber stmts to the landing pad of > > >> wherever it would be caught. > > > > > > OK, I still do not see how ehclanup1 can then safely remove them > > > pre-inline given that the whole function body can be inlined into > > > another containing the outer EH region. > > > > That's true. > > Yes, and I think they should be removed only after inlining. In that case I think I shoud look into detecting clobber only EH in inliner and do not account it into the size/time estimates. I always wondered why tramp3d becomes so harder for inliner with EH enabled, I seem to get it now ;) > (Alternatively the inliner could be extended to add clobbers when changing > an external-throw into an internal-throw, but well, ...) > > > > If this is valid, why we can not just eliminate EH in those outer > > > clobber try..finally as part of ehlowering earlier? > > > > Probably we'd miss too many inlining cases from early inlining and the > > ehcleanup1 time is just a heuristic that works good enough for us? > > No, removing even more clobbers would remove even more stack slot sharing. > If anything we should remove _less_ regions (as in the precondition for > Honzas sentence above, "If this is valid, ..." simply is false). AFAIU > this all is just a problem with O0 code quality. So, there's the obvious > solution: run ehcleanup also for O0. It is also problem of inliner quality decisions and memory use/compile time. The in-memory representation of unnecesary EH is quite big. I am quite ignorant in this area, but for -O0 can't we simply disable all clobbers? Honza > > > Ciao, > Michael.
Re: Do not produce empty try-finally statements
Hi, On Tue, 28 Jan 2014, Jakub Jelinek wrote: > On Tue, Jan 28, 2014 at 01:48:21PM +0100, Michael Matz wrote: > > Hi, > > > > On Tue, 28 Jan 2014, Jakub Jelinek wrote: > > > > > There are two kinds of clobbers, the direct ones, which surely can be > > > safely removed by ehcleanup1 if they are the only reason why there is a > > > landing pad which will be rethrown outside of the current function, > > > > You can only safely (as in, not introducing false conflicts for stack > > slots) remove them before inlining, _if_ the inliner would add them back > > in ... > > So it is just an optimization thing, not correctness, right? Of course, it always was. It's "safe" to remove all clobbers always. But then it's also pointless to generate them. For them to make sense they must survive until expansion. > We've been doing it this way since 4.7.0. > > > > For the direct cleanups, the important thing is that they are necessarily > > > local variables in the current function, so when returning from that > > > function, whether normally or abnormally, the inliner still has all the > > > info > > > about them. If it wants, it can add the corresponding clobbers itself > > > > ... which it doesn't. > > So, if we want to improve it, we should do that in the inliner. That'd we ideal, yes. > Because, not removing them in the ehcleanup1 means the inliner will see far > more complex functions than really necessary. I'll dispute the "far", but yeah. Ciao, Michael.
Re: Do not produce empty try-finally statements
On Tue, Jan 28, 2014 at 01:48:21PM +0100, Michael Matz wrote: > Hi, > > On Tue, 28 Jan 2014, Jakub Jelinek wrote: > > > There are two kinds of clobbers, the direct ones, which surely can be > > safely removed by ehcleanup1 if they are the only reason why there is a > > landing pad which will be rethrown outside of the current function, > > You can only safely (as in, not introducing false conflicts for stack > slots) remove them before inlining, _if_ the inliner would add them back > in ... So it is just an optimization thing, not correctness, right? We've been doing it this way since 4.7.0. > > For the direct cleanups, the important thing is that they are necessarily > > local variables in the current function, so when returning from that > > function, whether normally or abnormally, the inliner still has all the info > > about them. If it wants, it can add the corresponding clobbers itself > > ... which it doesn't. So, if we want to improve it, we should do that in the inliner. Because, not removing them in the ehcleanup1 means the inliner will see far more complex functions than really necessary. Jakub
Re: Do not produce empty try-finally statements
Hi, On Tue, 28 Jan 2014, Jakub Jelinek wrote: > There are two kinds of clobbers, the direct ones, which surely can be > safely removed by ehcleanup1 if they are the only reason why there is a > landing pad which will be rethrown outside of the current function, You can only safely (as in, not introducing false conflicts for stack slots) remove them before inlining, _if_ the inliner would add them back in ... > For the direct cleanups, the important thing is that they are necessarily > local variables in the current function, so when returning from that > function, whether normally or abnormally, the inliner still has all the info > about them. If it wants, it can add the corresponding clobbers itself ... which it doesn't. > (not sure if it does or doesn't bother, but if it doesn't bother right > now, it certainly could add those in the future if it proves to be > important). Ciao, Michael.
Re: Do not produce empty try-finally statements
Hi, On Tue, 28 Jan 2014, Richard Biener wrote: > >> The EH optimizations involving cleanups with only clobbers in them > >> are that if at the end of the cleanup after only CLOBBER stmts you > >> would rethrow the exception externally, then the clobber isn't needed > >> and the whole cleanup can be removed. And, if it rethrows somewhere > >> internally, we can move the clobber stmts to the landing pad of > >> wherever it would be caught. > > > > OK, I still do not see how ehclanup1 can then safely remove them > > pre-inline given that the whole function body can be inlined into > > another containing the outer EH region. > > That's true. Yes, and I think they should be removed only after inlining. (Alternatively the inliner could be extended to add clobbers when changing an external-throw into an internal-throw, but well, ...) > > If this is valid, why we can not just eliminate EH in those outer > > clobber try..finally as part of ehlowering earlier? > > Probably we'd miss too many inlining cases from early inlining and the > ehcleanup1 time is just a heuristic that works good enough for us? No, removing even more clobbers would remove even more stack slot sharing. If anything we should remove _less_ regions (as in the precondition for Honzas sentence above, "If this is valid, ..." simply is false). AFAIU this all is just a problem with O0 code quality. So, there's the obvious solution: run ehcleanup also for O0. Ciao, Michael.
Re: Do not produce empty try-finally statements
On Tue, Jan 28, 2014 at 11:48:16AM +0100, Richard Biener wrote: > On Mon, Jan 20, 2014 at 5:22 PM, Jan Hubicka wrote: > >> > >> Yes. Say, this could be surrounded by some try/catch, if we do it the > >> first > >> way, a would be still considered live across the EH path to whatever > >> catches > >> it. > >> > >> The EH optimizations involving cleanups with only clobbers in them are that > >> if at the end of the cleanup after only CLOBBER stmts you would rethrow > >> the exception > >> externally, then the clobber isn't needed and the whole cleanup can be > >> removed. And, if it rethrows somewhere internally, we can move the clobber > >> stmts to the landing pad of wherever it would be caught. > > > > OK, I still do not see how ehclanup1 can then safely remove them pre-inline > > given that the whole function body can be inlined into another containing > > the > > outer EH region. > > That's true. There are two kinds of clobbers, the direct ones, which surely can be safely removed by ehcleanup1 if they are the only reason why there is a landing pad which will be rethrown outside of the current function, and then the indirect ones, meant primarily for C++ destructors on *this, which are just heuristics and current state seems to be working good enough for them IMHO. For the direct cleanups, the important thing is that they are necessarily local variables in the current function, so when returning from that function, whether normally or abnormally, the inliner still has all the info about them. If it wants, it can add the corresponding clobbers itself (not sure if it does or doesn't bother, but if it doesn't bother right now, it certainly could add those in the future if it proves to be important). For the indirect clobbers, typically if you inline some destructor, you either inline it into another destructor which will also have (a larger) clobber for *this afterwards, or into a function that defines the var as local and there will be a normal direct clobber for it. Jakub
Re: Do not produce empty try-finally statements
On Mon, Jan 20, 2014 at 5:22 PM, Jan Hubicka wrote: >> >> Yes. Say, this could be surrounded by some try/catch, if we do it the first >> way, a would be still considered live across the EH path to whatever catches >> it. >> >> The EH optimizations involving cleanups with only clobbers in them are that >> if at the end of the cleanup after only CLOBBER stmts you would rethrow the >> exception >> externally, then the clobber isn't needed and the whole cleanup can be >> removed. And, if it rethrows somewhere internally, we can move the clobber >> stmts to the landing pad of wherever it would be caught. > > OK, I still do not see how ehclanup1 can then safely remove them pre-inline > given that the whole function body can be inlined into another containing the > outer EH region. That's true. > If this is valid, why we can not just eliminate EH in those > outer clobber try..finally as part of ehlowering earlier? Probably we'd miss too many inlining cases from early inlining and the ehcleanup1 time is just a heuristic that works good enough for us? Jakub? Micha? Thanks, Richard. > Honza >> >> Jakub
Re: Do not produce empty try-finally statements
> > Yes. Say, this could be surrounded by some try/catch, if we do it the first > way, a would be still considered live across the EH path to whatever catches > it. > > The EH optimizations involving cleanups with only clobbers in them are that > if at the end of the cleanup after only CLOBBER stmts you would rethrow the > exception > externally, then the clobber isn't needed and the whole cleanup can be > removed. And, if it rethrows somewhere internally, we can move the clobber > stmts to the landing pad of wherever it would be caught. OK, I still do not see how ehclanup1 can then safely remove them pre-inline given that the whole function body can be inlined into another containing the outer EH region. If this is valid, why we can not just eliminate EH in those outer clobber try..finally as part of ehlowering earlier? Honza > > Jakub
Re: Do not produce empty try-finally statements
On Mon, Jan 20, 2014 at 05:07:18PM +0100, Jan Hubicka wrote: > So I guess we can not modify giplifier to output > foo (&a.buf); > a.buf[6] = 1; > a.buf[7] = 8; > a = {CLOBBER}; > > instead of > > try > { > foo (&a.buf); > a.buf[6] = 1; > a.buf[7] = 8; > } > finally > { > a = {CLOBBER}; > } > > Whenever A is just CLOBBER because we care about dead store over the possible > EH path? Yes. Say, this could be surrounded by some try/catch, if we do it the first way, a would be still considered live across the EH path to whatever catches it. The EH optimizations involving cleanups with only clobbers in them are that if at the end of the cleanup after only CLOBBER stmts you would rethrow the exception externally, then the clobber isn't needed and the whole cleanup can be removed. And, if it rethrows somewhere internally, we can move the clobber stmts to the landing pad of wherever it would be caught. Jakub
Re: Do not produce empty try-finally statements
> >struct A { char buf[64]; }; > >void foo (char *); > >void test () > >{ > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > >} > >where with your patch we don't DSE the a.buf[6] and a.buf[7] stores > >anymore. I see now. > > > >If optimize_clobbers isn't performed at -O0, perhaps we should consider > >performing at at -O0 (just that and not the other EH optimizations)? > > Or, as clobbers are removed during rtl expansion make sure we remove empty eh > stuff there? Removing all the clutter that empty EH causes is quite a lot of work. In Jakub's testcase we need to output clobber on the non-EH path foo (&a.buf); a.buf[6] = 1; a.buf[7] = 8; a ={v} {CLOBBER}; So I guess we can not modify giplifier to output foo (&a.buf); a.buf[6] = 1; a.buf[7] = 8; a = {CLOBBER}; instead of try { foo (&a.buf); a.buf[6] = 1; a.buf[7] = 8; } finally { a = {CLOBBER}; } Whenever A is just CLOBBER because we care about dead store over the possible EH path? I suppose this may make sense to do so even when this try...finally is toplevel to aid possible inlining. The clobbers on EH are however currently removed by ehcleanup1 that is before we do any DSE and inlining.. Of course I would not like to have empty EH statemetns with all the unwind logic flying trhough the inliner's code estimate logic or I will have to teach it to anticipate their removal. Honza
Re: Do not produce empty try-finally statements
Jakub Jelinek wrote: >On Sun, Jan 19, 2014 at 11:46:09PM +0100, Jan Hubicka wrote: >> Bootstrapped/regtested x86_64-linux, OK? > >This looks very wrong. By ignoring the clobbers in the cleanups you >ensure clobbers are hardly ever emitted, but the TRY/FINALLY is the way >how >to make sure the clobbers appear in all the places where they are >supposed >to be, so that DSE, expansion etc. can take advantage of them. >Just look at say: >struct A { char buf[64]; }; >void foo (char *); >void test () >{ > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } > { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } >} >where with your patch we don't DSE the a.buf[6] and a.buf[7] stores >anymore. > >If optimize_clobbers isn't performed at -O0, perhaps we should consider >performing at at -O0 (just that and not the other EH optimizations)? Or, as clobbers are removed during rtl expansion make sure we remove empty eh stuff there? Richard. > Jakub
Re: Do not produce empty try-finally statements
On Sun, Jan 19, 2014 at 11:46:09PM +0100, Jan Hubicka wrote: > Bootstrapped/regtested x86_64-linux, OK? This looks very wrong. By ignoring the clobbers in the cleanups you ensure clobbers are hardly ever emitted, but the TRY/FINALLY is the way how to make sure the clobbers appear in all the places where they are supposed to be, so that DSE, expansion etc. can take advantage of them. Just look at say: struct A { char buf[64]; }; void foo (char *); void test () { { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } { A a; foo (a.buf); a.buf[6] = 1; a.buf[7] = 8; } } where with your patch we don't DSE the a.buf[6] and a.buf[7] stores anymore. If optimize_clobbers isn't performed at -O0, perhaps we should consider performing at at -O0 (just that and not the other EH optimizations)? Jakub