https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93115

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mjambor at suse dot cz

--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
OK, the missed optimization results from the following sequence:

1) ipa-cp creates a specialized node for o. It is called only once, from fn3;
   p calls the unspecialized o. I wonder why this happens, since both calls
   (in p and in fn3) lead to devirtualization.
2) The inliner inlines k into o. This does not enable devirtualization because
   o is not specialized.
3) At the end of inlining, remove_unreachable_nodes removes the offline copy of
   m::av.
4) We inline o into p, enabling devirtualization, but it is too late.
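
To make the sequence above concrete, here is a hypothetical reduction. It is
NOT the testcase attached to the PR: the names (j, m, av, k, o, p, fn3, g, e,
p1) are borrowed from this comment and the dump, but every function body, the
return values and the bool parameter of p are assumptions made purely for
illustration.

```cpp
// Hypothetical sketch only; not the PR's testcase.
struct j {
  virtual int av() { return 1; }
  virtual ~j() = default;
};

struct m : j {
  int av() override { return 2; }
};

j g;  // object passed by fn3 (assumed to have dynamic type j)
m e;  // object passed by p (assumed to have dynamic type m)

// Small helper that the inliner would put into o. Seen from inside o,
// p1 has an unknown dynamic type, so the call stays virtual.
static int k(j &p1) { return p1.av(); }

int o(j &p1) { return k(p1); }

// Unconditional call: the candidate that gets its own clone of o.
int fn3() { return o(g); }

// Conditional call: the candidate that is rejected for cloning. Only
// once o is inlined here can p1.av() be resolved to m::av.
int p(bool cond) {
  if (!cond)
    return 0;
  return o(e);
}
```

In this sketch the call in p can only be devirtualized to m::av after o has
been inlined into p, which is the situation described in step 4.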

Adding the inline keyword to p makes the inliner inline it early, but we still
miss the devirtualization. So we have two issues:

a) for some reason ipa-cp rules out a reasonable specialization.
It is decided here:
Evaluating opportunities for void o(j&)/11.
 - considering value &g for param #0 p1 (caller_count: 1)
     good_cloning_opportunity_p (time: 1, size: 36, freq_sum: 1000) -> evaluation: 27, threshold: 500
     good_cloning_opportunity_p (time: 199, size: 120, freq_sum: 1000) -> evaluation: 1658, threshold: 500
  Creating a specialized node of void o(j&)/11.
     the new node is o.constprop/37.
     known ctx 0 is     Outer type:struct j offset 0
 - considering value &e.D.2397 for param #0 p1 (caller_count: 1)
     good_cloning_opportunity_p (time: 1, size: 36, freq_sum: 202) -> evaluation: 5, threshold: 500
     good_cloning_opportunity_p (time: 103, size: 120, freq_sum: 202) -> evaluation: 173, threshold: 500
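
As a quick check of the numbers in this dump: all four "evaluation" values are
consistent with time * freq_sum / size using truncating integer division. The
sketch below only reproduces that arithmetic; the formula is inferred from the
figures above, and the helper name cloning_evaluation is hypothetical. It is
not a copy of the ipa-cp code, which may apply further adjustments.

```cpp
#include <cstdint>

// Hypothetical helper. Assumption: evaluation = time * freq_sum / size,
// truncated toward zero. Inferred from the dump, not from the GCC source.
constexpr std::int64_t cloning_evaluation(std::int64_t time,
                                          std::int64_t freq_sum,
                                          std::int64_t size) {
  return time * freq_sum / size;
}

// Value &g (freq_sum 1000).
static_assert(cloning_evaluation(1, 1000, 36) == 27, "matches dump");
static_assert(cloning_evaluation(199, 1000, 120) == 1658, "matches dump");

// Value &e.D.2397 (freq_sum 202).
static_assert(cloning_evaluation(1, 202, 36) == 5, "matches dump");
static_assert(cloning_evaluation(103, 202, 120) == 173, "matches dump");
```

Under this assumed formula, with time 103 and size 120 the evaluation first
reaches the threshold of 500 at freq_sum 583, so a call site at 202 falls well
short.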
I assume this is because freq_sum is 202 instead of 1000, since the call is
conditional, but that is really way too strict... The 20% is the outcome of:

Predictions for bb 2                                                            
  DS theory heuristics: 20.24%                                                  
  combined heuristics: 20.24%                                                   
  call heuristics of edge 2->3: 33.00%                                          
  early return (on trees) heuristics of edge 2->3: 34.00%                       
Predictions for bb 3  

which seems reasonable.

b) we do not devirtualize after inlining. We combine context correctly:

Polymorphic call context combine:    Speculative outer type:struct j (or a derived type) at offset 0
With context:                        Outer type:struct m offset 0
Updated as:                          Outer type:struct m offset 0 Speculative outer type:struct j (or a derived type) at offset 0

but I am not sure why it does not trigger devirtualization at this stage. We
also do not need a speculative outer type when we know the outer type
precisely, but that is a cosmetic issue.
