On 09/09/2016 09:17 AM, Segher Boessenkool wrote:
On Thu, Sep 08, 2016 at 10:41:37AM -0600, Jeff Law wrote:
So can you expand on the malloc example a bit -- I'm pretty sure I
understand what you're trying to do, but a concrete example may help
Bernd and be useful for archival purposes.

Sure, but it's big (which is the problem :-) )
Yea :( But it's likely a very compelling example of real-world code. It's almost certainly too big to turn into a testcase of any kind, but some before/after annotated code would be helpful.

Ideally we'd also have some smaller testcases we could put in the testsuite to ensure that the feature keeps working as intended over time.


That's a later addition anyway and isn't necessary to do
shrink-wrapping in the first place.

No, it always did that, just not as often (it only duplicated straight-line
code before).
Presumably (I haven't looked yet), the duplication is so that we can
isolate one or more paths which in turn allows sinking the prologue
further on some of those paths.

It duplicates as many blocks as it needs to dup, to make as many exits
as possible reachable without *any* prologue/epilogue.

As the header comment before the older code says:

/* Try to perform a kind of shrink-wrapping, making sure the
   prologue/epilogue is emitted only around those parts of the
   function that require it.

   There will be exactly one prologue, and it will be executed either
   zero or one time, on any path.
Right. That's always been my understanding of the key driver for placement. There's exactly one prologue, and it is executed either once or not at all on any path through the CFG.

Essentially this is comparable to PRE-like algorithms for placement of expression evaluations.

And in the separate shrink-wrapping world, we're leaving that model behind, and I think that's one of the big things I'm struggling with -- we may execute a prologue component more than once if I've read everything correctly.
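
For the archive, a minimal sketch of the classic "zero or one prologue" model. It is not taken from the patch or from the malloc case; get_value and slow_path are hypothetical names, and whether a given compiler actually shrink-wraps this depends on the target and options. The early-return path needs no frame, so the single prologue can be sunk onto the path that makes the call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper.  In real code this would be an out-of-line call,
   which is what forces the caller to set up a frame.  */
static int slow_path(int *p)
{
    assert(p != NULL);
    return *p;
}

/* Hypothetical function with one cheap exit and one expensive path.  */
int get_value(int *p)
{
    if (p == NULL)
        return -1;              /* exit reachable with no prologue at all */

    /* Only this path needs the prologue/epilogue: it is the only one
       that makes a call and so must save the return address and any
       callee-saved registers it uses.  */
    return slow_path(p) + 1;
}
```

Under the old model the prologue runs zero times on the p == NULL path and exactly once on the other; no path runs it twice.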


 Depending on where the prologue is
   placed, some of the basic blocks can be reached via both paths with
   and without a prologue.  Such blocks will be duplicated here, and the
   edges changed to match.
Understood.

This really feels comparable to block duplication for the purposes of isolating a particular path through the CFG so that path can be modified without affecting the behavior of other paths through the CFG.

It's also directly comparable to block duplication to allow more aggressive code motion in PRE-like algorithms.
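
A toy model of the duplication step described in the header comment, as a sketch only. The five-block CFG, the array representation, and the function names are assumptions made for illustration and are not GCC's data structures or algorithm. It marks the blocks that can be reached both before and after a chosen prologue block; those are the ones that would need a second copy.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 5
#define MAXSUCC 2

/* Toy CFG (hypothetical): succ[b] lists the successors of block b,
   with -1 meaning "no successor".
     0: entry, branches to 1 or 3
     1: needs the prologue, falls through to 2
     2: goes to 3
     3: shared tail, goes to 4
     4: exit  */
static const int succ[NBLOCKS][MAXSUCC] = {
    {1, 3}, {2, -1}, {3, -1}, {4, -1}, {-1, -1}
};

/* Mark every block reachable from START without entering block AVOID.
   Pass -1 as AVOID to avoid nothing.  */
static void reach(int start, int avoid, bool seen[NBLOCKS])
{
    assert(start >= 0 && start < NBLOCKS);
    if (start == avoid || seen[start])
        return;
    seen[start] = true;
    for (int i = 0; i < MAXSUCC; i++)
        if (succ[start][i] >= 0)
            reach(succ[start][i], avoid, seen);
}

/* Set DUP[b] for each block reachable both without the prologue (from
   the entry, never passing through PROLOGUE_BB) and with it (from
   PROLOGUE_BB onward).  Returns the number of such blocks.  */
int blocks_to_duplicate(int prologue_bb, bool dup[NBLOCKS])
{
    bool without_pro[NBLOCKS] = { false };
    bool with_pro[NBLOCKS] = { false };
    int n = 0;

    reach(0, prologue_bb, without_pro);
    reach(prologue_bb, -1, with_pro);

    for (int b = 0; b < NBLOCKS; b++) {
        dup[b] = without_pro[b] && with_pro[b];
        n += dup[b];
    }
    return n;
}
```

With the prologue placed at block 1, blocks 3 and 4 are reachable both ways, so each would get a prologue-free copy for the 0 -> 3 -> 4 path while the originals keep the epilogue. The model ignores loops back to blocks ahead of the prologue, which the real code has to handle.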

Jeff
