[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2018-10-01 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

Eric Gallager  changed:

   What|Removed |Added

URL||http://gcc.gnu.org/ml/gcc-p
   ||atches/2013-12/msg01693.htm
   ||l
   Assignee|aldyh at gcc dot gnu.org   |unassigned at gcc dot 
gnu.org

--- Comment #11 from Eric Gallager  ---
(In reply to Eric Gallager from comment #10)
> (In reply to Aldy Hernandez from comment #9)
> > Created attachment 31787 [details]
> > removal of transactions from clones
> > 
> > This is a patch that fixes part of the problem, but as discussed in the
> > thread, is not sufficient since the uninstrumented code path will still end
> > up containing a nested transaction after inlining.
> > 
> > I am attaching this patch for further reference when this PR is picked up
> > again.  See thread for more details.
> 
> Are you still working on it?

No reply; taking that as a "no"

[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2018-07-01 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

Eric Gallager  changed:

   What|Removed |Added

   Keywords||patch
 CC||egallager at gcc dot gnu.org

--- Comment #10 from Eric Gallager  ---
(In reply to Aldy Hernandez from comment #9)
> Created attachment 31787 [details]
> removal of transactions from clones
> 
> This is a patch that fixes part of the problem, but as discussed in the
> thread, is not sufficient since the uninstrumented code path will still end
> up containing a nested transaction after inlining.
> 
> I am attaching this patch for further reference when this PR is picked up
> again.  See thread for more details.

Are you still working on it?

[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2014-01-09 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

--- Comment #9 from Aldy Hernandez aldyh at gcc dot gnu.org ---
Created attachment 31787
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31787action=edit
removal of transactions from clones

This is a patch that fixes part of the problem, but as discussed in the thread,
is not sufficient since the uninstrumented code path will still end up
containing a nested transaction after inlining.

I am attaching this patch for further reference when this PR is picked up
again.  See thread for more details.


[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2014-01-06 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

--- Comment #8 from Aldy Hernandez aldyh at gcc dot gnu.org ---
Proposed patch and subsequent discussions:
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01693.html


[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2013-12-12 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

--- Comment #7 from Aldy Hernandez aldyh at gcc dot gnu.org ---
Well, we could tweak the inliner cost model, thus causing the early inliner to
inline the nested transactions earlier.  With the attached patch, f() gets
inlined into g() early enough so that pass_ipa_tm sees the nested transaction
before the uninstrumented path has been added.

So with this patch, we could twiddle IPA tm to remove nested transactions
without it handling the unnecessarily complex uninstrumented/instrumented code
paths.  However, it seems to me that the proper IPA inliner may add other
(possibly nested) transactions, which will have us back to square one by tmmark
time (??).

This sounds like a good start, and if we find that the latter IPA inliner
creates other unnecessary nested transactions (in other test cases), we could
bite the bullet and do the whole thing in tmmark, handling both uninstrumented
and instrumented code paths.

Waddaya think?

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index f42ade02..1cb9e58 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3969,7 +3969,7 @@ init_inline_once (void)
   eni_size_weights.target_builtin_call_cost = 1;
   eni_size_weights.div_mod_cost = 1;
   eni_size_weights.omp_cost = 40;
-  eni_size_weights.tm_cost = 10;
+  eni_size_weights.tm_cost = 2;
   eni_size_weights.time_based = false;
   eni_size_weights.return_cost = 1;


[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2013-12-11 Thread rth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

--- Comment #5 from Richard Henderson rth at gcc dot gnu.org ---
Yes, removing nested transactions (when possible) is worth doing.

The problem is not just with ipa_inline, but also with early_inline.
Indeed, for an example this small, I would have expected early_inline
to have already nested f into g between lower_tm and ipa_tm.

I think I'd always intended that ipa_tm delete nested transations as
it saw them, since _even if_ f isn't inlined into g, the transational
clone of f (which g should call) also should not have the nested
transaction.  I.e. inside a transactional clone we know we must be
within an outer transaction.

As a sort of pass-within-a-pass, we're already walking the transaction
tree, and the blocks therein, in order to transform calls.  I would
think it would be easy enough to add a similar pass-within-a-pass to
walk the transaction tree and delete nested transactions.  Hopefully
we could do this before actually creating the uninstrumented path.

I think moving the creation of the uninstrumented path after ipa_inline
is probably a mistake.  The inliner makes decisions e.g. based on the
number of calls to a function.  If we double the number of calls with
the creation of the uninst path, that could significantly change the
decisions that ipa_inline would make.


r~


[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2013-12-11 Thread aldyh at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

--- Comment #6 from Aldy Hernandez aldyh at redhat dot com ---
 walk the transaction tree and delete nested transactions.  Hopefully
 we could do this before actually creating the uninstrumented path.

 I think moving the creation of the uninstrumented path after ipa_inline
 is probably a mistake.  The inliner makes decisions e.g. based on the

Ah I see.  Yes, I agree that the creation of the uninstrumented path 
after ipa_inline is not a good idea, but then you are contradicting 
yourself.  You want to delete nested transactions before creating the 
uninstrumented path (first quoted paragraph), but yet you don't want to 
move the uninstrumented path after ipa_inline (second quoted paragraph).

Do you mean remove the nested transactions after ipa_inline (possibly 
tmmark), but then just suck it up and handle the multiple 
instrumented/uninstrumented code paths in the CFG?

Or do you mean something else?

Or do you mean make early inling smarter so it inlines transactions 
properly?  Although perhaps something else could be added by ipa_inline, 
so we'd have the same problem.


[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2013-12-10 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

Aldy Hernandez aldyh at gcc dot gnu.org changed:

   What|Removed |Added

 CC||rth at gcc dot gnu.org

--- Comment #4 from Aldy Hernandez aldyh at gcc dot gnu.org ---
At tmmark pass time, IPA inlining has already happened so we could conceivably
add code to the pass to remove nested transactions by accumulating all the BB's
in a nested candidate (nested transactions without a call to
__transaction_cancel), and manually removing the corresponding
GIMPLE_TRANSACTION and BUILT_IN_TM_COMMIT*'s.

The problem with this approach, however, is that the uninstrumented code path
has already been added (in pass_ipa_tm), prior to the IPA inlining pass, so the
CFG is a bit of a mess, with 2 transactions for each original transaction, as
well as with EDGE_TM_UNINSTRUMENTED edges linking things together.  So to do
this properly, we should take care of removing one of the two paths entirely
(instrumented or uninstrumented), as well as removing all the
GIMPLE_TRANSACTION/BUILT_IN_TM_COMMIT*'s therein.  [Note: It doesn't matter
which one, since tmmark has not run and there is no instrumentation in either
one.  The paths are identical at this point.]

This isn't impossible, but tedious, and I wonder if it's worth doing for
unnecessary nested transactions appearing after IPA inlining. ??

Another approach would be to move the uninstrumenting code (which is currently
in pass_ipa_tm, which runs before IPA inlining) into its own actual IPA pass
(not this small_ipa_pass business) and run it right after IPA inlining.  At
which point, right before doing the uninstrumentation edge thing, remove the
necessary GIMPLE_TRANSACTION/BUILT_IN_TM_COMMIT*'s, thus doing it before the
CFG becomes unnecessarily complex by the addition of the
instrumented/uninstrumented edges.

I'm not terribly fond of these approaches.  Does anyone see a cleaner approach?
 Am I missing something?  Is it even worth doing?


[Bug tree-optimization/56572] GCC generates non-optimal transactional code when inlining nested transaction.

2013-12-09 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56572

Aldy Hernandez aldyh at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-09
  Component|c   |tree-optimization
Version|trans-mem   |4.9.0
 Ever confirmed|0   |1

--- Comment #3 from Aldy Hernandez aldyh at gcc dot gnu.org ---
Confirmed.  The problem is also present in 4.9.

The problem here is that the code removing nested transactions runs as part of
the execute_lower_tm() pass:

  /* If there was absolutely nothing transaction related inside the
 transaction, we may elide it.  Likewise if this is a nested
 transaction and does not contain an abort.  */
  if (this_state == 0
  || (!(this_state  GTMA_HAVE_ABORT)  outer_state != NULL))

...whereas inlining happens in a subsequent pass.