2016-06-21 15:48 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>:
> On Mon, Jun 20, 2016 at 11:53 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Mon, Jun 20, 2016 at 10:31 AM, Ilya Enkovich <enkovich....@gmail.com> 
>> wrote:
>>> On 20 Jun 09:45, H.J. Lu wrote:
>>>> On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich <enkovich....@gmail.com> 
>>>> wrote:
>>>> > 2016-06-20 16:39 GMT+03:00 Uros Bizjak <ubiz...@gmail.com>:
>>>> >> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu <hongjiu...@intel.com> wrote:
>>>> >>> TImode register referenced in debug insn can be converted to V1TImode
>>>> >>> by scalar to vector optimization.  We need to convert a debug insn if
>>>> >>> it has a variable in a TImode register.
>>>> >>>
>>>> >>> Tested on x86-64.  OK for trunk?
>>>> >>
>>>> >> Ilya, does this approach look good to you? Also, does DImode STV
>>>> >> conversion need similar handling of debug insns?
>>>> >
>>>> > DImode conversion doesn't change register mode (i.e. never calls
>>>> > PUT_MODE for registers).  That keeps debug instructions valid.
>>>> >
>>>> > Overall I don't like the idea of having debug insns in candidates
>>>> > set and in chains.  Looks like it is possible to have a chain
>>>> > consisting of a debug insn only which is weird (otherwise I don't
>>>> > see why we may return false in timode_scalar_chain::convert_insn).
>>>>
>>>> Yes, it can happen:
>>>>
>>>> (insn 11 8 12 2 (parallel [
>>>>             (set (reg/v:TI 91 [ <retval> ])
>>>>                 (plus:TI (reg/v:TI 92 [ a ])
>>>>                     (reg/v:TI 96 [ b ])))
>>>>             (clobber (reg:CC 17 flags))
>>>>         ]) y.i:5 210 {*addti3_doubleword}
>>>>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>>>>         (nil)))
>>>> (debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [ <retval> ])) 
>>>> y.i:5 -1
>>>>      (nil))
>>>>
>>>>
>>>> > What about other possible register uses?  If debug insns are added
>>>> > to candidates then NONDEBUG_INSN_P check for uses in
>>>> > timode_check_non_convertible_regs becomes invalid, right?
>>>>
>>>> Debug insn has no impact on STV decision.  We just need to convert
>>>> register referenced in debug insn from V1TImode to TImode in
>>>> timode_scalar_chain::convert_insn.
>>>>
>>>> > If we have (or want) to fix some register uses then it's probably
>>>> > would be better to visit register uses when we convert its mode
>>>> > and make required fix-ups.  It seems better to me to not involve
>>>> > debug insns in analysis phase.
>>>>
>>>> Here is the updated patch to add debug insn, which references the
>>>> TImode register which will be converted to V1TImode to queue.
>>>> I am testing it now.
>>>>
>>>
>>> You still count and dump debug insns as optimized ones.  Also we
>>> try to use virtual functions to cover differences in DI and TI
>>> optimizations and introducing additional TARGET_64BIT in common
>>> STV code is undesirable.
>>>
>>> Also your conversion now depends on instructions processing order.
>>> You will fail to process debug insn before non-debug ones. Required
>>> order is not guaranteed because processing depends on instruction
>>> UIDs only.
>>>
>>> I propose to modify transformation phase only like in the patch
>>> (untested) below.  I rely on your code which assumes the only
>>> possible usage in debug insn is VAR_LOCATION.
>>>
>>> Thanks,
>>> Ilya
>>> --
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index c5e5e12..ec955f0 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain
>>>
>>>   private:
>>>    void mark_dual_mode_def (df_ref def);
>>> +  void fix_debug_reg_uses (rtx reg);
>>>    void convert_insn (rtx_insn *insn);
>>>    /* We don't convert registers to difference size.  */
>>>    void convert_registers () {}
>>> @@ -3790,6 +3791,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
>>>    df_insn_rescan (insn);
>>>  }
>>>
>>> +/* Fix uses of converted REG in debug insns.  */
>>> +
>>> +void
>>> +timode_scalar_chain::fix_debug_reg_uses (rtx reg)
>>> +{
>>> +  df_ref ref;
>>> +  for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = DF_REF_NEXT_REG 
>>> (ref))
>>> +    {
>>> +      rtx_insn *insn = DF_REF_INSN (ref);
>>> +
>>> +      if (DEBUG_INSN_P (insn))
>>> +       {
>>> +         /* It must be a debug insn with a TImode variable in register.  */
>>> +         rtx val = PATTERN (insn);
>>> +         gcc_assert (GET_MODE (val) == TImode
>>> +                     && GET_CODE (val) == VAR_LOCATION);
>>> +         rtx loc = PAT_VAR_LOCATION_LOC (val);
>>> +         gcc_assert (REG_P (loc)
>>> +                     && GET_MODE (loc) == V1TImode);
>>> +         /* Convert V1TImode register, which has been updated by a SET
>>> +             insn before, to SUBREG TImode.  */
>>> +         PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
>>> +         df_insn_rescan (insn);
>>> +         return;
>>> +       }
>>> +    }
>>> +}
>>> +
>>>  /* Convert INSN from TImode to V1T1mode.  */
>>>
>>>  void
>>> @@ -3806,8 +3835,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn)
>>>         rtx tmp = find_reg_equal_equiv_note (insn);
>>>         if (tmp)
>>>           PUT_MODE (XEXP (tmp, 0), V1TImode);
>>> +       PUT_MODE (dst, V1TImode);
>>> +       fix_debug_reg_uses (dst);
>>>        }
>>> -      /* FALLTHRU */
>>> +      break;
>>>      case MEM:
>>>        PUT_MODE (dst, V1TImode);
>>>        break;
>>
>> Won't be it waste CPU cycles when there is no debug insin
>> which use TImode registers?
>>
>
> Here is the updated patch.  Tested on x86-64.  OK for trunk?

I'm OK with this version.

Thanks,
Ilya

>
> Thanks.
>
>
> --
> H.J.

Reply via email to