Hi Jiangning,

> I see your point, so do you mean we must generate fmla instruction for 
> intrinsic function vfma_lane_f32(), no matter if it is in -ffast-math mode or 
> not? Then I think we have to generate fmls for intrinsic function 
> vfms_lane_f32() as well.

I believe so.

> I don't see LLVM IR has @llvm.fms.* defined, so we have to define an aarch64 
> specific LLVM intrinsic, or we can use an expression containing llvm.fma.* to 
> represent it?

I think I worked out that it was equivalent to @llvm.fma(-x, y, z)
(and @llvm.fma(x, -y, z)). The negation is exact, and the fusing works
out to be the same for "z + (-x)*y" as for "z - x*y".
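
For what it's worth, here is a rough scalar sketch of that equivalence,
using std::fma from <cmath> as a stand-in for @llvm.fma. The function
names are made up for illustration and are not part of any intrinsic
header; it only models the arithmetic, not the lane variants.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar models of a fused multiply-subtract, z - x*y,
// written with the standard fused multiply-add and one negated operand.
float fms_neg_x(float x, float y, float z) { return std::fma(-x, y, z); }
float fms_neg_y(float x, float y, float z) { return std::fma(x, -y, z); }

// Self-check on inputs where the exact product needs more than 24 bits.
void check_fms_forms() {
    const float x = 1.0f + std::ldexp(1.0f, -12);  // 1 + 2^-12
    const float z = 1.0f + std::ldexp(1.0f, -11);  // 1 + 2^-11
    // Exactly, x*x = 1 + 2^-11 + 2^-24, so z - x*x = -2^-24.
    assert(fms_neg_x(x, x, z) == fms_neg_y(x, x, z));
    assert(fms_neg_x(x, x, z) == -std::ldexp(1.0f, -24));
}
```

Both spellings round only once, so the 2^-24 term of the product
survives into the result.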

By the way, be wary of the operand order. @llvm.fma(x, y, z) calculates
"x*y + z", but "fmla x, y, z" calculates "x + y*z". I *think* both Ana
and I got that wrong at least once. I know I did.
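
To make the two conventions concrete, a small sketch (again with
std::fma standing in for @llvm.fma, and with hypothetical helper names
chosen just for this example):

```cpp
#include <cassert>
#include <cmath>

// @llvm.fma-style order: multiplicands first, addend last, giving x*y + z.
float fma_llvm_order(float x, float y, float z) { return std::fma(x, y, z); }

// fmla-style order as described above: accumulator first, giving x + y*z.
float fmla_order(float x, float y, float z) { return std::fma(y, z, x); }

void check_operand_order() {
    assert(fma_llvm_order(2.0f, 3.0f, 4.0f) == 10.0f);  // 2*3 + 4
    assert(fmla_order(2.0f, 3.0f, 4.0f) == 14.0f);      // 2 + 3*4
}
```

So lowering something in fmla order to @llvm.fma means moving the
first operand to the last position.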

Cheers.

Tim.
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
