https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124137

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Samples: 1M of event 'cycles:Pu', Event count (approx.): 2536912561130          
Overhead       Samples  Command  Shared Object         Symbol                   
  37.25%        714641  cc1plus  cc1plus               [.]
ix86_get_dominator_for_reg(unsigned int, basic_block_def*)
  22.81%        437343  cc1plus  cc1plus               [.]
ix86_place_single_tls_call(rtx_def*, rtx_def*, x86_cse_kind, auto_bitmap&,
auto_bitmap&,
  15.99%        306757  cc1plus  cc1plus               [.]
dominated_by_p(cdi_direction, basic_block_def const*, basic_block_def const*)
   8.37%        160583  cc1plus  cc1plus               [.]
note_pattern_stores(rtx_def const*, void (*)(rtx_def*, rtx_def const*, void*),
void*)


I'll note the pass fails to compute dominators, so likely gets slow query.


static basic_block
ix86_get_dominator_for_reg (unsigned int regno, basic_block bb)
{ 
  basic_block set_bb;
  auto_bitmap set_bbs;

  /* Get all BBs which set REGNO and dominate the current BB from all
     DEFs of REGNO.  */
  for (df_ref def = DF_REG_DEF_CHAIN (regno);
       def;
       def = DF_REF_NEXT_REG (def))
    if (!DF_REF_IS_ARTIFICIAL (def)
        && !DF_REF_FLAGS_IS_SET (def, DF_REF_MAY_CLOBBER)
        && !DF_REF_FLAGS_IS_SET (def, DF_REF_MUST_CLOBBER))
      {
        set_bb = DF_REF_BB (def);
        if (dominated_by_p (CDI_DOMINATORS, bb, set_bb))
          bitmap_set_bit (set_bbs, set_bb->index);
      }

  bb = nearest_common_dominator_for_set (CDI_DOMINATORS, set_bbs);


This will, for regs with many defs, be very inefficent.  Doing this nested in

      /* If any live caller-saved registers aren't dead at the end of
         this basic block, get the basic block which dominates all
         basic blocks which set the remaining live registers.  */
      auto_bitmap set_bbs;
      bitmap_iterator bi;
      unsigned int id;
      EXECUTE_IF_SET_IN_BITMAP (live_caller_saved_regs, 0, id, bi)
        {     
          basic_block set_bb = ix86_get_dominator_for_reg (id, bb);
          bitmap_set_bit (set_bbs, set_bb->index);
        }     
      bb = nearest_common_dominator_for_set (CDI_DOMINATORS, set_bbs);

is of course going to be worse.  Those are hard-regs, with possibly many
defs.  And this iterates on 'bb'.  Ugh.

That looks algorithmically wrong designed.

Reply via email to