Richard Biener <richard.guent...@gmail.com> writes: > On Tue, Aug 18, 2015 at 4:15 PM, Richard Sandiford > <richard.sandif...@arm.com> wrote: >> Richard Biener <richard.guent...@gmail.com> writes: >>> On Tue, Aug 18, 2015 at 1:07 PM, David Sherwood >>> <david.sherw...@arm.com> wrote: >>>>> On Mon, Aug 17, 2015 at 11:29 AM, David Sherwood >>>>> <david.sherw...@arm.com> wrote: >>>>> > Hi Richard, >>>>> > >>>>> > Thanks for the reply. I'd chosen to add new expressions as this >>>>> > seemed more >>>>> > consistent with the existing MAX_EXPR and MIN_EXPR tree codes. In >>>>> > addition it >>>>> > would seem to provide more opportunities for optimisation than a >>>>> > target-specific >>>>> > builtin implementation would. I accept that optimisation >>>>> > opportunities will >>>>> > be more limited for strict math compilation, but that it was still >>>>> > worth having >>>>> > them. Also, if we did map it to builtins then the scalar version would >>>>> > go >>>>> > through the optabs and the vector version would go through the >>>>> > target's builtin >>>>> > expansion, which doesn't seem very consistent. >>>>> >>>>> On another note ISTR you can't associate STRICT_MIN/MAX_EXPR and thus >>>>> you can't vectorize anyway? (strict IEEE behavior is about NaNs, >>>>> correct?) >>>> I thought for this particular case associativity wasn't an issue? >>>> We're not doing any >>>> reductions here, just simply performing max/min operations on each >>>> pair of elements >>>> in the vectors. I thought for IEEE-compliant behaviour we just need to >>>> ensure that for >>>> each pair of elements if one element is a NaN we return the other one. >>> >>> Hmm, true. Ok, my comment still stands - I don't see that using a >>> tree code is the best thing to do here. You can add fmin/max optabs >>> and special expansion of BUILT_IN_FMIN/MAX and you can use a target >>> builtin for the vectorized variant. >>> >>> The reason I am pushing against a new tree code is that we'd have an >>> awful lot of similar codes when pushing other flag related IL >>> specialities to actual IL constructs. And we still need to find a >>> consistent way to do that. >> >> In this case though the new code is really the "native" min/max operation >> for fp, rather than some weird flag-dependent behaviour. Maybe it's >> a bit unfortunate that the non-strict min/max fp operation got mapped >> to the generic MIN_EXPR and MAX_EXPR when the non-strict version is really >> the flag-related modification. The STRICT_* prefix is forced by that and >> might make it seem like more of a special case than it really is. > > In some sense. But the "strict" version already has a builtin (just no > special expander in builtins.c). We usually don't add 1:1 tree codes > for existing builtins (why have builtins at all then?).
We still need the builtin to match the C function (and to allow direct calls to __builtin_fmin, etc., which are occasionally useful). >> If you're still not convinced, how about an internal function instead >> of a built-in function, so that we can continue to use optabs for all >> cases? I'd really like to avoid forcing such a generic concept down to >> target-specific builtins with target-specific expansion code, especially >> when the same concept is exposed by target-independent code for scalars. > > The target builtin is for the vectorized variant - not all targets might have > that and we'd need to query the target about this. So using a IFN would > mean adding a target hook for that query. No, the idea is that if we have a tree code or an internal function, the decision about whether we have target support is based on a query of the optabs (just like it is for scalar, and for other vectorisable tree codes). No new hooks are needed. The patch checked for target support that way. > > TBH though I'm not sure why an internal_fn value (or a target-specific > > builtin enum value) is worse than a tree-code value, unless the limit > > of the tree_code bitfield is in sight (maybe it is). > > I think tree_code is 64bits now. Even better :-) Thanks, Richard