+ There are two useful preprocessor defines for use by maintainers:
+
+ #define LOG_COSTS
+
+ if you wish to see the actual cost estimates that are being used
+ for each mode wider than word mode and the cost estimates for zero
+ extension and the shifts. This can be useful when port maintainers
+ are tuning insn rtx costs.
+
+ #define FORCE_LOWERING
+
+ if you wish to test the pass with all the transformation forced on.
+ This can be useful for finding bugs in the transformations.
Must admit I'm not keen on these kinds of macro, but it's Ian's call.
Idea for the future (i.e. not this patch) is to have a dump file for
target initialisation.
Imagine my horror when I did all of this as you had privately suggested
and discovered that there was no way to log what I was doing. This is
good enough until someone wants to fix the general problem.
+/* This pass can transform 4 different operations: move, ashift,
+ lshiftrt, and zero_extend. There is a boolean vector for move
+ splitting that is indexed by mode and is true for each mode that is
+ to have its copies split. The other three operations are only done
+ for one mode so they are only controlled by a single boolean .*/
As mentioned privately, whether this is profitable for shifts depends
to some extent on the shift amount. GCC already supports targets where
this transformation would be OK for some shift amounts but not others.
So for shifts, I think this should be an array of HOST_BITS_PER_WIDE_INT
booleans rather than just one.
More comments below about how this filters through your other changes.
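For illustration only, a per-amount table along the lines suggested above might look like the following sketch. Every name here is hypothetical and not taken from the patch: `SHIFT_TABLE_SIZE` stands in for HOST_BITS_PER_WIDE_INT with an assumed value of 64, `TOY_WORD_BITS` is an assumed word size, and `toy_split_wins` is a made-up cost comparison standing in for real rtx cost queries.

```c
#include <stdbool.h>

/* Stand-in for HOST_BITS_PER_WIDE_INT; 64 is an assumed value.  */
#define SHIFT_TABLE_SIZE 64
/* Assumed word size in bits for an imaginary target.  */
#define TOY_WORD_BITS 32

/* One flag per shift amount, instead of a single boolean.  */
static bool split_ashift_p[SHIFT_TABLE_SIZE];

/* Hypothetical cost comparison: pretend that splitting only pays off
   when the shift amount is at least one word.  */
static bool
toy_split_wins (int amount)
{
  return amount >= TOY_WORD_BITS;
}

/* Fill the table once, at target initialisation time.  */
static void
init_shift_table (void)
{
  for (int amount = 0; amount < SHIFT_TABLE_SIZE; amount++)
    split_ashift_p[amount] = toy_split_wins (amount);
}

/* Query used at each candidate shift; out-of-range amounts are never
   split.  */
static bool
can_split_ashift (int amount)
{
  return amount >= 0 && amount < SHIFT_TABLE_SIZE && split_ashift_p[amount];
}
```

With this layout all cost queries happen once up front, and each candidate shift costs only a table lookup.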
I think you are actually missing what I am doing with this. I
look at 3 representative values that "should" discover any
non-uniformities. If any of them is profitable, I set this bit. Then, at
the point where I really have to pull the trigger on a real instance, I
check the shift amount used at that spot to see if the individual shift
is profitable.
I did this for two reasons. One was that I was a little
concerned that HOST_BITS_PER_WIDE_INT on the smallest host might not be as
big as the bitsize of word_mode on the largest target (it could be,
but this knowledge is above my pay grade). The other was that I did
not see this as a common operation, and checking it on demand seemed like
the winner.
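As a rough sketch of the scheme described above (not the code in the patch), the two-stage check could be pictured like this. All names, the toy cost numbers, and in particular the three probe amounts are assumptions made for the example; the thread does not say which representative values are used.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed word size in bits for an imaginary target.  */
#define WORD_BITS 32

/* Toy cost model (hypothetical): the unsplit double-word shift always
   costs 4; the split form costs 2 for amounts of at least one word
   and 6 otherwise.  */
static int
wide_shift_cost (int amount)
{
  (void) amount;
  return 4;
}

static int
split_shift_cost (int amount)
{
  return amount >= WORD_BITS ? 2 : 6;
}

/* True if splitting a shift by AMOUNT is cheaper than leaving it.  */
static bool
shift_profitable_p (int amount)
{
  return split_shift_cost (amount) < wide_shift_cost (amount);
}

/* Stage 1, run once: set the single flag if any probe is profitable.
   The probe amounts below are illustrative guesses.  */
static bool
compute_splitting_shift (void)
{
  static const int probes[] = { WORD_BITS, WORD_BITS + WORD_BITS / 2,
                                2 * WORD_BITS - 1 };
  for (size_t i = 0; i < sizeof probes / sizeof probes[0]; i++)
    if (shift_profitable_p (probes[i]))
      return true;
  return false;
}

/* Stage 2, run on demand: the flag is only a gate; the actual shift
   amount at this spot is re-checked before transforming.  */
static bool
should_split_shift (bool splitting_shift, int amount)
{
  return splitting_shift && shift_profitable_p (amount);
}
```

The trade-off against a per-amount table is that the cost query is repeated at each candidate shift, which is cheap if such shifts are rare.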
I will do everything else you mention and resubmit after I fix Ramana's ICE.