Andrew Pinski <pins...@gmail.com> writes:
> On Fri, May 11, 2018 at 11:04 AM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> Andrew Pinski <pins...@gmail.com> writes:
>>> On Fri, May 11, 2018 at 10:15 AM, Richard Sandiford
>>> <richard.sandif...@linaro.org> wrote:
>>>> There are four optabs for various forms of fused multiply-add:
>>>> fma, fms, fnma and fnms.  Of these, only fma had a direct gimple
>>>> representation.  For the other three we relied on special pattern-
>>>> matching during expand, although tree-ssa-math-opts.c did have
>>>> some code to try to second-guess what expand would do.
>>>>
>>>> This patch removes the old FMA_EXPR representation of fma and
>>>> introduces four new internal functions, one for each optab.
>>>> IFN_FMA is tied to BUILT_IN_FMA* while the other three are
>>>> independent directly-mapped internal functions.  It's then
>>>> possible to do the pattern-matching in match.pd, and
>>>> tree-ssa-math-opts.c (via folding) can select the exact
>>>> FMA-based operation.
>>>>
>>>> The patch removes the gimple FE support for __FMA rather than mapping
>>>> it to the internal function.  There's no reason now to treat it
>>>> differently from other internal functions (although the FE doesn't
>>>> handle those yet).
>>>>
>>>> The BRIG & HSA parts are a best guess, but seem relatively simple.
>>>>
>>>> The genmatch.c changes are structured to allow ternary ops in which
>>>> the second two rather than the first two operands are commutative.
>>>> A later patch makes use of this.
>>>>
>>>> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf,
>>>> x86_64-linux-gnu and powerpc64le-linux-gnu.  OK to install?
>>>
>>>
>>> I think there might be an issue with long double and __float128
>>> support here (for both PowerPC and x86_64).  Please add testcases for
>>> those to show they are not problematic.
>>> What about half float in the aarch64 case?  Is that handled correctly?
>>> I did not see a testcase for that case either.
>>
>> What specific kind of problem are you thinking of?  The patch is
>> type-generic and the internal functions are only used if the target
>> advertises the appropriate optab.  Targets already check (or should
>> check) that the optabs are used under the appropriate conditions for
>> that target.  E.g. gcc.target/powerpc/float128-fma1.c checks for the
>> four cases of __float128 for PowerPC.
>
> It was more of a reference to the documentation addition you made:
> "+Target supports all four fused multiply-add optabs for both @code{float}
> +and @code{double}."

Ah, OK.  That was just for the new testsuite target selector.  It had
to be defined relative to *some* types, and I thought that in practice
any target implementing these optabs would provide them for at least
float and double, so those seemed the best types to use in the new tests.

The tests weren't meant to cover every possible type, and I guess
there's an argument that the float tests are redundant given the double
ones.  At least having both does prove some level of type genericity though.
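
To give a concrete picture, the shape of a new-style test is roughly as
below.  This is only an illustrative sketch: the effective-target name
(scalar_all_fma) and the dump-scanning details are my assumptions here,
not directives quoted from the patch.

  /* { dg-do compile } */
  /* { dg-options "-O2 -fdump-tree-optimized" } */
  /* { dg-require-effective-target scalar_all_fma } */

  double
  f (double a, double b, double c)
  {
    /* On a target providing the fms optab for DFmode, this should be
       folded to IFN_FMS rather than left as multiply + subtract.  */
    return a * b - c;
  }

  /* { dg-final { scan-tree-dump {\.FMS} "optimized" } } */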

> Also, a side note: while I was working on improving the use of integer
> madd instructions on aarch64, I found that if I changed the "madd<mode>"
> pattern name to "fma<mode>" I could get more madd used (and add with
> shifts too).  Does the FMA internal function still support integer
> types?

Yeah, if the optab is defined for the corresponding integer modes.

Not sure whether that's a good or a bad thing :-), but at least it's
consistent with how FMA_EXPR worked.
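
For anyone skimming the thread, the mapping is (my shorthand summary of
the optab semantics rather than text from the patch):

  IFN_FMA  (a, b, c)  =  a * b + c    fma<mode>4
  IFN_FMS  (a, b, c)  =  a * b - c    fms<mode>4
  IFN_FNMA (a, b, c)  = -a * b + c    fnma<mode>4
  IFN_FNMS (a, b, c)  = -a * b - c    fnms<mode>4

so a source-level version of the integer case you describe would be
something like:

  long
  imadd (long a, long b, long c)
  {
    /* Could become IFN_FMA, assuming the target defines the integer
       fma optab (fmadi4 for DImode); that is a property of the target,
       not something the patch adds by itself.  */
    return a * b + c;
  }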

Thanks,
Richard
