Kenneth Zadeck <zad...@naturalbridge.com> writes:
>>> +   There are two useful preprocessor defines for use by maintainers:
>>> +
>>> +   #define LOG_COSTS
>>> +
>>> +   if you wish to see the actual cost estimates that are being used
>>> +   for each mode wider than word mode and the cost estimates for zero
>>> +   extension and the shifts.   This can be useful when port maintainers
>>> +   are tuning insn rtx costs.
>>> +
>>> +   #define FORCE_LOWERING
>>> +
>>> +   if you wish to test the pass with all the transformation forced on.
>>> +   This can be useful for finding bugs in the transformations.
>> Must admit I'm not keen on these kinds of macro, but it's Ian's call.
>>
>> Idea for the future (i.e. not this patch) is to have a dump file for
>> target initialisation.
> Imagine my horror when i did all of this as you had privately suggested 
> and discovered that there was no way to log what i was doing.  This is 
> good enough until someone wants to fix the general problem.
>
>>> +/* This pass can transform 4 different operations: move, ashift,
>>> +   lshiftrt, and zero_extend.  There is a boolean vector for move
>>> +   splitting that is indexed by mode and is true for each mode that is
>>> +   to have its copies split.  The other three operations are only done
>>> +   for one mode so they are only controlled by a single boolean  .*/
>> As mentioned privately, whether this is profitable for shifts depends
>> to some extent on the shift amount.  GCC already supports targets where
>> this transformation would be OK for some shift amounts but not others.
>> So for shifts, I think this should be an array of HOST_BITS_PER_WIDE_INT
>> booleans rather than just one.
>>
>> More comments below about how this filters through your other changes.
> I think that you actually are missing what i am doing with this.   I 
> look at 3 representative values that "should" discover any non 
> uniformities.   If any of them are profitable, i set this bit.  Then at 
> the point where i really have to pull the trigger on a real instance, i 
> check the shift amount used at that spot to see if the individual shift 
> is profitable.

No, I got that.  I just think it's an unnecessary complication.

> I did this for two reasons.   One of them was that i was a little 
> concerned that HOST_BITS_PER_WIDE_INT on the smallest host was not as 
> big as the bitsize of word_word mode on the largest target (it could be 
> but this knowledge is above my pay grade).

Ah, yes, sorry, I meant an array of BITS_PER_WORD booleans.  I had
HOST_WIDE_INT on the brain after Mike's patch.

> The other reason was did not see this as a common operation and
> checking it on demand seemed like the winner.

But (at least after the other changes I mentioned) these overall
booleans cut out only a very small portion of find_decomposable_shift_zext.
I.e.:

  op = SET_SRC (set);
  if (GET_CODE (op) != ASHIFT
      && GET_CODE (op) != LSHIFTRT
      && GET_CODE (op) != ZERO_EXTEND)  <-- unified booleans checked here
    return 0;

  op_operand = XEXP (op, 0);
  if (!REG_P (SET_DEST (set)) || !REG_P (op_operand)
      || HARD_REGISTER_NUM_P (REGNO (SET_DEST (set)))
      || HARD_REGISTER_NUM_P (REGNO (op_operand))
      || !SCALAR_INT_MODE_P (GET_MODE (op)))
    return 0;

  if (GET_CODE (op) == ZERO_EXTEND)
    {
      if (GET_MODE (op_operand) != word_mode
          || GET_MODE_BITSIZE (GET_MODE (op)) != 2 * BITS_PER_WORD)
        return 0;
    }
  else /* left or right shift */
    {
      <--- specific booleans checked here
      if (!CONST_INT_P (XEXP (op, 1))
          || INTVAL (XEXP (op, 1)) < BITS_PER_WORD
          || GET_MODE_BITSIZE (GET_MODE (op_operand)) != 2 * BITS_PER_WORD)
        return 0;
    }

It seems better (and simpler) not to prejudge which shift amounts are
interesting and instead cache the "win or no win" flag for each value.

As I say, this is all in the context of this pass not being interesting
for modes where the split move is strictly more expensive than the
unified move, regardless of shift & zext costs.

Richard

Reply via email to