[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 Eric Gallager changed: What|Removed |Added URL||http://gcc.gnu.org/ml/gcc-p ||atches/2013-12/msg01693.htm ||l Assignee|aldyh at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #11 from Eric Gallager --- (In reply to Eric Gallager from comment #10) > (In reply to Aldy Hernandez from comment #9) > > Created attachment 31787 [details] > > removal of transactions from clones > > > > This is a patch that fixes part of the problem, but as discussed in the > > thread, is not sufficient since the uninstrumented code path will still end > > up containing a nested transaction after inlining. > > > > I am attaching this patch for further reference when this PR is picked up > > again. See thread for more details. > > Are you still working on it? No reply; taking that as a "no"
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 Eric Gallager changed: What|Removed |Added Keywords||patch CC||egallager at gcc dot gnu.org --- Comment #10 from Eric Gallager --- (In reply to Aldy Hernandez from comment #9) > Created attachment 31787 [details] > removal of transactions from clones > > This is a patch that fixes part of the problem, but as discussed in the > thread, is not sufficient since the uninstrumented code path will still end > up containing a nested transaction after inlining. > > I am attaching this patch for further reference when this PR is picked up > again. See thread for more details. Are you still working on it?
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 --- Comment #9 from Aldy Hernandez aldyh at gcc dot gnu.org --- Created attachment 31787 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31787action=edit removal of transactions from clones This is a patch that fixes part of the problem, but as discussed in the thread, is not sufficient since the uninstrumented code path will still end up containing a nested transaction after inlining. I am attaching this patch for further reference when this PR is picked up again. See thread for more details.
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 --- Comment #8 from Aldy Hernandez aldyh at gcc dot gnu.org --- Proposed patch and subsequent discussions: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01693.html
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 --- Comment #7 from Aldy Hernandez aldyh at gcc dot gnu.org --- Well, we could tweak the inliner cost model, thus causing the early inliner to inline the nested transactions earlier. With the attached patch, f() gets inlined into g() early enough so that pass_ipa_tm sees the nested transaction before the uninstrumented path has been added. So with this patch, we could twiddle IPA tm to remove nested transactions without it handling the unnecessarily complex uninstrumented/instrumented code paths. However, it seems to me that the proper IPA inliner may add other (possibly nested) transactions, which will have us back to square one by tmmark time (??). This sounds like a good start, and if we find that the latter IPA inliner creates other unnecessary nested transactions (in other test cases), we could bite the bullet and do the whole thing in tmmark, handling both uninstrumented and instrumented code paths. Waddaya think? diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index f42ade02..1cb9e58 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -3969,7 +3969,7 @@ init_inline_once (void) eni_size_weights.target_builtin_call_cost = 1; eni_size_weights.div_mod_cost = 1; eni_size_weights.omp_cost = 40; - eni_size_weights.tm_cost = 10; + eni_size_weights.tm_cost = 2; eni_size_weights.time_based = false; eni_size_weights.return_cost = 1;
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 --- Comment #5 from Richard Henderson rth at gcc dot gnu.org --- Yes, removing nested transactions (when possible) is worth doing. The problem is not just with ipa_inline, but also with early_inline. Indeed, for an example this small, I would have expected early_inline to have already nested f into g between lower_tm and ipa_tm. I think I'd always intended that ipa_tm delete nested transations as it saw them, since _even if_ f isn't inlined into g, the transational clone of f (which g should call) also should not have the nested transaction. I.e. inside a transactional clone we know we must be within an outer transaction. As a sort of pass-within-a-pass, we're already walking the transaction tree, and the blocks therein, in order to transform calls. I would think it would be easy enough to add a similar pass-within-a-pass to walk the transaction tree and delete nested transactions. Hopefully we could do this before actually creating the uninstrumented path. I think moving the creation of the uninstrumented path after ipa_inline is probably a mistake. The inliner makes decisions e.g. based on the number of calls to a function. If we double the number of calls with the creation of the uninst path, that could significantly change the decisions that ipa_inline would make. r~
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 --- Comment #6 from Aldy Hernandez aldyh at redhat dot com --- walk the transaction tree and delete nested transactions. Hopefully we could do this before actually creating the uninstrumented path. I think moving the creation of the uninstrumented path after ipa_inline is probably a mistake. The inliner makes decisions e.g. based on the Ah I see. Yes, I agree that the creation of the uninstrumented path after ipa_inline is not a good idea, but then you are contradicting yourself. You want to delete nested transactions before creating the uninstrumented path (first quoted paragraph), but yet you don't want to move the uninstrumented path after ipa_inline (second quoted paragraph). Do you mean remove the nested transactions after ipa_inline (possibly tmmark), but then just suck it up and handle the multiple instrumented/uninstrumented code paths in the CFG? Or do you mean something else? Or do you mean make early inling smarter so it inlines transactions properly? Although perhaps something else could be added by ipa_inline, so we'd have the same problem.
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 Aldy Hernandez aldyh at gcc dot gnu.org changed: What|Removed |Added CC||rth at gcc dot gnu.org --- Comment #4 from Aldy Hernandez aldyh at gcc dot gnu.org --- At tmmark pass time, IPA inlining has already happened so we could conceivably add code to the pass to remove nested transactions by accumulating all the BB's in a nested candidate (nested transactions without a call to __transaction_cancel), and manually removing the corresponding GIMPLE_TRANSACTION and BUILT_IN_TM_COMMIT*'s. The problem with this approach, however, is that the uninstrumented code path has already been added (in pass_ipa_tm), prior to the IPA inlining pass, so the CFG is a bit of a mess, with 2 transactions for each original transaction, as well as with EDGE_TM_UNINSTRUMENTED edges linking things together. So to do this properly, we should take care of removing one of the two paths entirely (instrumented or uninstrumented), as well as removing all the GIMPLE_TRANSACTION/BUILT_IN_TM_COMMIT*'s therein. [Note: It doesn't matter which one, since tmmark has not run and there is no instrumentation in either one. The paths are identical at this point.] This isn't impossible, but tedious, and I wonder if it's worth doing for unnecessary nested transactions appearing after IPA inlining. ?? Another approach would be to move the uninstrumenting code (which is currently in pass_ipa_tm, which runs before IPA inlining) into its own actual IPA pass (not this small_ipa_pass business) and run it right after IPA inlining. At which point, right before doing the uninstrumentation edge thing, remove the necessary GIMPLE_TRANSACTION/BUILT_IN_TM_COMMIT*'s, thus doing it before the CFG becomes unnecessarily complex by the addition of the instrumented/uninstrumented edges. I'm not terribly fond of these approaches. Does anyone see a cleaner approach? Am I missing something? Is it even worth doing?
[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572 Aldy Hernandez aldyh at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2013-12-09 Component|c |tree-optimization Version|trans-mem |4.9.0 Ever confirmed|0 |1 --- Comment #3 from Aldy Hernandez aldyh at gcc dot gnu.org --- Confirmed. The problem is also present in 4.9. The problem here is that the code removing nested transactions runs as part of the execute_lower_tm() pass: /* If there was absolutely nothing transaction related inside the transaction, we may elide it. Likewise if this is a nested transaction and does not contain an abort. */ if (this_state == 0 || (!(this_state GTMA_HAVE_ABORT) outer_state != NULL)) ...whereas inlining happens in a subsequent pass.