Hi,

Just some comments on the fuseable_load_p part, since that's what
we were discussing last time.

It looks like this now relies on:

Ajit Agarwal <aagar...@linux.ibm.com> writes:
> +      /* We use DF data flow because we change location rtx
> +      which is easier to find and modify.
> +      We use mix of rtl-ssa def-use and DF data flow
> +      where it is easier.  */
> +      df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> +      df_analyze ();
> +      df_set_flags (DF_DEFER_INSN_RESCAN);

But please don't do this!  For one thing, building DU/UD chains
as well as rtl-ssa is really expensive in terms of compile time.
But more importantly, modifications need to happen via rtl-ssa
to ensure that the IL is kept up-to-date.  If we don't do that,
later fuse attempts will be based on stale data and so could
generate incorrect code.

> +// Check whether load can be fusable or not.
> +// Return true if fuseable otherwise false.
> +bool
> +rs6000_pair_fusion::fuseable_load_p (insn_info *info)
> +{
> +  for (auto def : info->defs())
> +    {
> +      auto set = dyn_cast<set_info *> (def);
> +      for (auto use1 : set->nondebug_insn_uses ())
> +     use1->set_is_live_out_use (true);
> +    }

What was the reason for adding this loop?

> +
> +  rtx_insn *rtl_insn = info ->rtl ();
> +  rtx body = PATTERN (rtl_insn);
> +  rtx dest_exp = SET_DEST (body);
> +
> +  if (REG_P (dest_exp) &&
> +      (DF_REG_DEF_COUNT (REGNO (dest_exp)) > 1

The rtl-ssa way of checking this is:

  crtl->ssa->is_single_dominating_def (...)

> +       || DF_REG_EQ_USE_COUNT (REGNO (dest_exp)) > 0))
> +    return  false;

Why are uses in notes a problem?  In the worst case, we should just be
able to remove the note instead.

> +
> +  rtx addr = XEXP (SET_SRC (body), 0);
> +
> +  if (GET_CODE (addr) == PLUS
> +      && XEXP (addr, 1) && CONST_INT_P (XEXP (addr, 1)))
> +    {
> +      if (INTVAL (XEXP (addr, 1)) == -16)
> +     return false;
> +  }

What's special about -16?

> +
> +  df_ref use;
> +  df_insn_info *insn_info = DF_INSN_INFO_GET (info->rtl ());
> +  FOR_EACH_INSN_INFO_DEF (use, insn_info)
> +    {
> +      struct df_link *def_link = DF_REF_CHAIN (use);
> +
> +      if (!def_link || !def_link->ref
> +       || DF_REF_IS_ARTIFICIAL (def_link->ref))
> +     continue;
> +
> +      while (def_link && def_link->ref)
> +     {
> +       rtx_insn *insn = DF_REF_INSN (def_link->ref);
> +       if (GET_CODE (PATTERN (insn)) == PARALLEL)
> +         return false;

Why do you need to skip PARALLELs?

> +
> +       rtx set = single_set (insn);
> +       if (set == NULL_RTX)
> +         return false;
> +
> +       rtx op0 = SET_SRC (set);
> +       rtx_code code = GET_CODE (op0);
> +
> +       // This check is added as register pairs are not generated
> +       // by RA for neg:V2DF (fma: V2DF (reg1)
> +       //                  (reg2)
> +       //                  (neg:V2DF (reg3)))
> +       if (GET_RTX_CLASS (code) == RTX_UNARY)
> +         return false;

What's special about (neg (fma ...))?

> +
> +       def_link = def_link->next;
> +     }
> +     }
> +  return true;
> +}

Thanks,
Richard

Reply via email to