Kenneth Zadeck <zad...@naturalbridge.com> writes: >>> + There are two useful preprocessor defines for use by maintainers: >>> + >>> + #define LOG_COSTS >>> + >>> + if you wish to see the actual cost estimates that are being used >>> + for each mode wider than word mode and the cost estimates for zero >>> + extension and the shifts. This can be useful when port maintainers >>> + are tuning insn rtx costs. >>> + >>> + #define FORCE_LOWERING >>> + >>> + if you wish to test the pass with all the transformation forced on. >>> + This can be useful for finding bugs in the transformations. >> Must admit I'm not keen on these kinds of macro, but it's Ian's call. >> >> Idea for the future (i.e. not this patch) is to have a dump file for >> target initialisation. > Imagine my horror when i did all of this as you had privately suggested > and discovered that there was no way to log what i was doing. This is > good enough until someone wants to fix the general problem. > >>> +/* This pass can transform 4 different operations: move, ashift, >>> + lshiftrt, and zero_extend. There is a boolean vector for move >>> + splitting that is indexed by mode and is true for each mode that is >>> + to have its copies split. The other three operations are only done >>> + for one mode so they are only controlled by a single boolean .*/ >> As mentioned privately, whether this is profitable for shifts depends >> to some extent on the shift amount. GCC already supports targets where >> this transformation would be OK for some shift amounts but not others. >> So for shifts, I think this should be an array of HOST_BITS_PER_WIDE_INT >> booleans rather than just one. >> >> More comments below about how this filters through your other changes. > I think that you actually are missing what i am doing with this. I > look at 3 representative values that "should" discover any non > uniformities. If any of them are profitable, i set this bit. Then at > the point where i really have to pull the trigger on a real instance, i > check the shift amount used at that spot to see if the individual shift > is profitable.
No, I got that. I just think it's an unnecessary complication. > I did this for two reasons. One of them was that i was a little > concerned that HOST_BITS_PER_WIDE_INT on the smallest host was not as > big as the bitsize of word_word mode on the largest target (it could be > but this knowledge is above my pay grade). Ah, yes, sorry, I meant an array of BITS_PER_WORD booleans. I had HOST_WIDE_INT on the brain after Mike's patch. > The other reason was did not see this as a common operation and > checking it on demand seemed like the winner. But (at least after the other changes I mentioned) these overall booleans cut out only a very small portion of find_decomposable_shift_zext. I.e.: op = SET_SRC (set); if (GET_CODE (op) != ASHIFT && GET_CODE (op) != LSHIFTRT && GET_CODE (op) != ZERO_EXTEND) <-- unified booleans checked here return 0; op_operand = XEXP (op, 0); if (!REG_P (SET_DEST (set)) || !REG_P (op_operand) || HARD_REGISTER_NUM_P (REGNO (SET_DEST (set))) || HARD_REGISTER_NUM_P (REGNO (op_operand)) || !SCALAR_INT_MODE_P (GET_MODE (op))) return 0; if (GET_CODE (op) == ZERO_EXTEND) { if (GET_MODE (op_operand) != word_mode || GET_MODE_BITSIZE (GET_MODE (op)) != 2 * BITS_PER_WORD) return 0; } else /* left or right shift */ { <--- specific booleans checked here if (!CONST_INT_P (XEXP (op, 1)) || INTVAL (XEXP (op, 1)) < BITS_PER_WORD || GET_MODE_BITSIZE (GET_MODE (op_operand)) != 2 * BITS_PER_WORD) return 0; } It seems better (and simpler) not to prejudge which shift amounts are interesting and instead cache the "win or no win" flag for each value. As I say, this is all in the context of this pass not being interesting for modes where the split move is strictly more expensive than the unified move, regardless of shift & zext costs. Richard