https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61034

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
We arrive at different final optimizations depending on PUSH_ARGS_REVERSED
(see PR67203).  The current (GCC 6) final state is either 3 or 4 calls,
depending on that.  And this is only because the final DCE (which removes
malloc/free pairs) needs some more DSE first, but the late DSE only runs
after the late DCE.  The late dce/dse passes are the only pair with this
particular ordering; all other pairs come the other way around, which would
end up removing all malloc/free pairs in this testcase (finally).
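
To make the dependency concrete, here is a minimal sketch in C.  It is a
hypothetical illustration, not the testcase from this PR, and the function
name is made up.  The assumption is that the store into the allocated block
counts as a use of it, so the malloc/free pair only becomes removable by DCE
once DSE has deleted that store:

```c
#include <stdlib.h>

/* Hypothetical example, not the testcase from this PR.  */
int
scratch_then_add (int x)
{
  int *p = malloc (sizeof *p);
  if (p)
    *p = x;   /* Never read back: a dead store, which is DSE's job.  */
  free (p);   /* With the store gone, malloc/free is an unused pair
                 that DCE can drop.  */
  return x + 1;
}
```

Whether a particular GCC release actually reduces this to a plain add is not
claimed here; the -fdump-tree-optimized dump shows what survives for a given
version and target.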

Of course DSE and DCE depend on each other, so exchanging the last two
isn't trivial surgery.  Ideally DSE would have at least a basic
DCE embedded, or we'd finally merge both passes (given that DSE is quite
ad-hoc anyway).
