On Fri, Oct 18, 2013 at 12:06:35PM +0200, Richard Biener wrote:
> You can't move type conversion "out of the way" in most cases as
> GIMPLE is strongly typed
> and data sources and sinks can obviously not be "promoted" (nor can
> function arguments).
> So you'll very likely not be able to remove the code from the
> optimizers, it will only maybe
> trigger less often.

My take on the type demotion and promotion is that we badly need it and
the question is just in which pass to do it.

The benefit of type demotion is code canonicalization and removing
unnecessary computation that e.g. only affects the upper bits that are
going to be thrown away anyway. The disadvantage of type demotion of
signed operations is that we need to perform them in an unsigned type
instead, and thus we can't perform some loop optimizations based on
undefined behavior etc. See e.g.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c8
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477#c16
for some testcases where type demotion can improve generated code.
If types are demoted, upper bits of constants go away, SCCVN can find
equivalences between SSA_NAMEs that wouldn't be considered before, etc.

But given the issue with signed operation type demotion, I think before
loop optimizations we should only be doing type demotions that don't
turn operations with previously undefined overflow behavior into
defined ones.

I guess passes like forwprop, gimple-fold etc. could easily handle the
easy cases, where there is a tree of has_single_use SSA_NAMEs that can
be demoted, but handling a more complicated web would be harder.
Say in:

unsigned int a, b, c, d, e, f;
unsigned char h, i, j;

void
foo (void)
{
  unsigned int k = a * 2 + b + 0x12340000;
  unsigned int l = c * 4 + d + 0x23456700;
  unsigned int m = e * 5 + f, n = k + l - m, o = k - l + m, p = -k + 1;
  h = n;
  i = o;
  j = p;
}

k, l, m all have multiple imm uses, but still pretty much everything in
this function could be demoted to unsigned char, the two large constants
could go away as additions of zero, etc. Perhaps that can be seen as
little benefit, but what if the above is all
s/unsigned int/unsigned long long/;s/unsigned char/unsigned int/
on a 32-bit target?

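To make the claim about the example concrete, here is a minimal sketch
that compares the function with a hand-demoted variant. The names
foo_demoted, u8, count_mismatches and self_check, as well as the LCG
used to generate inputs, are made up for illustration; none of this is
what any GCC pass emits. It relies only on unsigned arithmetic being
modular, so the low 8 bits of n, o and p depend only on the low 8 bits
of the inputs, and on both large constants having a zero low byte.

```c
#include <assert.h>

typedef unsigned char u8;       /* hypothetical shorthand */

unsigned int a, b, c, d, e, f;
unsigned char h, i, j;

/* The function from the mail, computed in unsigned int.  */
void
foo (void)
{
  unsigned int k = a * 2 + b + 0x12340000;
  unsigned int l = c * 4 + d + 0x23456700;
  unsigned int m = e * 5 + f, n = k + l - m, o = k - l + m, p = -k + 1;
  h = n;
  i = o;
  j = p;
}

/* Hand-demoted variant (illustrative only): every value is kept
   modulo 256.  C promotes u8 operands to int, so each result is
   truncated explicitly; conversion to an unsigned type is modular.  */
void
foo_demoted (void)
{
  u8 a8 = (u8) a, b8 = (u8) b, c8 = (u8) c;
  u8 d8 = (u8) d, e8 = (u8) e, f8 = (u8) f;
  u8 k = (u8) (a8 * 2 + b8);    /* + 0x12340000: low byte is 0 */
  u8 l = (u8) (c8 * 4 + d8);    /* + 0x23456700: low byte is 0 */
  u8 m = (u8) (e8 * 5 + f8);
  h = (u8) (k + l - m);
  i = (u8) (k - l + m);
  j = (u8) (-k + 1);
}

/* Run both variants on pseudo-random inputs and count how often
   the stored h, i, j differ.  */
unsigned int
count_mismatches (unsigned int rounds)
{
  unsigned int *in[] = { &a, &b, &c, &d, &e, &f };
  unsigned int seed = 12345u, bad = 0;

  for (unsigned int r = 0; r < rounds; r++)
    {
      for (int x = 0; x < 6; x++)
        {
          /* Simple LCG; unsigned wraparound is well defined.  */
          seed = seed * 1664525u + 1013904223u;
          *in[x] = seed;
        }
      foo ();
      u8 h1 = h, i1 = i, j1 = j;
      foo_demoted ();
      if (h != h1 || i != i1 || j != j1)
        bad++;
    }
  return bad;
}

/* Convenience wrapper: aborts if the two variants ever disagree.  */
void
self_check (void)
{
  assert (count_mismatches (1000) == 0);
}
```

Calling self_check () from a small test driver should pass silently.
Note that this equivalence holds here because the source types are
unsigned; for signed types the demoted form would have to use unsigned
arithmetic, which is exactly the loss of undefined-overflow information
discussed above.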
With long long on a 32-bit target, the RTL subreg pass might help a
little bit, but that is too late.

For the demotion which changes undefined overflow operations to defined
ones, I wonder which is the last pass that usefully makes use of that
information, and whether e.g. we could do the full type demotion already
before vectorization somewhere in the loop optimization queue, or if
that is still too early.

Where type demotion and promotion is very important is IMHO
vectorization; the code we generate for mixed-type vectorization is
just huge and terrible. If we can help it by not computing useless
upper bits, or on the other side by sometimes not doing parts of
computations in smaller types, which leads to all the other computations
on wider types being done with a bigger vectorization factor, we could
improve generated code quality. I wonder if for vectorization we
couldn't use the same thing I wrote recently for if-conversion, for bbs
potentially suitable for vectorization (with the right loop form etc.).
That is, if we don't do full type demotion before vectorization, check
if we'd demote anything and if so, work only on the vectorization-only
loop copy (or create it), and then try to do some type promotion to
minimize the number of type sizes in the loop; see the
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477#c16
(admittedly artificial) testcase for what I mean. After demotion, we
could replace the cast of short to char and back with just a bitwise
and (for zero extension) or a shift left + signed shift right (for sign
extension), etc.

And, finally, the question is whether we generate good code if we just
expand RTL from the demoted types. We'd better, because the user could
have written his code in the narrower types from the beginning (well,
C implicit promotions make that harder, but fold-const already demotes
some computations that appear in a single statement). Or, if there are
advantages to promoting some types, what algorithm to use for that,
what cost model, what target hooks etc.

	Jakub