Richard Biener <richard.guent...@gmail.com> writes: > On Tue, Aug 18, 2015 at 1:07 PM, David Sherwood <david.sherw...@arm.com> > wrote: >>> On Mon, Aug 17, 2015 at 11:29 AM, David Sherwood >>> <david.sherw...@arm.com> wrote: >>> > Hi Richard, >>> > >>> > Thanks for the reply. I'd chosen to add new expressions as this seemed >>> > more >>> > consistent with the existing MAX_EXPR and MIN_EXPR tree codes. In >>> > addition it >>> > would seem to provide more opportunities for optimisation than a >>> > target-specific >>> > builtin implementation would. I accept that optimisation opportunities >>> > will >>> > be more limited for strict math compilation, but that it was still >>> > worth having >>> > them. Also, if we did map it to builtins then the scalar version would go >>> > through the optabs and the vector version would go through the >>> > target's builtin >>> > expansion, which doesn't seem very consistent. >>> >>> On another note ISTR you can't associate STRICT_MIN/MAX_EXPR and thus >>> you can't vectorize anyway? (strict IEEE behavior is about NaNs, correct?) >> I thought for this particular case associativity wasn't an issue? >> We're not doing any >> reductions here, just simply performing max/min operations on each >> pair of elements >> in the vectors. I thought for IEEE-compliant behaviour we just need to >> ensure that for >> each pair of elements if one element is a NaN we return the other one. > > Hmm, true. Ok, my comment still stands - I don't see that using a > tree code is the best thing to do here. You can add fmin/max optabs > and special expansion of BUILT_IN_FMIN/MAX and you can use a target > builtin for the vectorized variant. > > The reason I am pushing against a new tree code is that we'd have an > awful lot of similar codes when pushing other flag related IL > specialities to actual IL constructs. And we still need to find a > consistent way to do that.
In this case though the new code is really the "native" min/max operation for fp, rather than some weird flag-dependent behaviour. Maybe it's a bit unfortunate that the non-strict min/max fp operation got mapped to the generic MIN_EXPR and MAX_EXPR when the non-strict version is really the flag-related modification. The STRICT_* prefix is forced by that and might make it seem like more of a special case than it really is. If you're still not convinced, how about an internal function instead of a built-in function, so that we can continue to use optabs for all cases? I'd really like to avoid forcing such a generic concept down to target-specific builtins with target-specific expansion code, especially when the same concept is exposed by target-independent code for scalars. TBH though I'm not sure why an internal_fn value (or a target-specific builtin enum value) is worse than a tree-code value, unless the limit of the tree_code bitfield is in sight (maybe it is). Thanks, Richard