> There's also the order of the additions to think about: (add (add LHS,
> RHS), rounding_bit) vs (add (add LHS, rounding_bit), RHS) and so on.
> I'll press on with the others for now, since I don't see any other
> looming issues there.

I spoke too soon. VQDMLAL saturates its accumulate step, so it has an
extra saturation operation not present in VQDMULL followed by a plain
addition. It probably needs to stay separate for now (there is a much
more complicated pattern that could work, but that's well beyond the
scope of a simple implementation for AArch64).
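
To make the difference concrete, here is a rough scalar sketch of one
16-bit lane, written from my reading of the instruction descriptions
rather than taken from any patch. The helper names (sat_s32,
model_vqdmull_s16, model_wrapping_add, model_vqdmlal_s16) are
hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Clamp a 64-bit value into the signed 32-bit range. */
static int32_t sat_s32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Assumed model of one lane of VQDMULL.S16: saturate(2 * a * b). */
static int32_t model_vqdmull_s16(int16_t a, int16_t b) {
    return sat_s32(2 * (int64_t)a * (int64_t)b);
}

/* A generic, non-saturating add: wraps modulo 2^32.
 * The conversion back to int32_t is implementation-defined in C,
 * but yields two's-complement wrapping on common compilers. */
static int32_t model_wrapping_add(int32_t x, int32_t y) {
    return (int32_t)((uint32_t)x + (uint32_t)y);
}

/* Assumed model of one lane of VQDMLAL.S16: the accumulate
 * saturates as well, which is the extra saturation step. */
static int32_t model_vqdmlal_s16(int32_t acc, int16_t a, int16_t b) {
    return sat_s32((int64_t)acc + model_vqdmull_s16(a, b));
}
```

With acc = INT32_MAX and a = b = 1, the VQDMLAL model stays at
INT32_MAX, while the VQDMULL model followed by the wrapping add
overflows to a negative value, so the two forms are not
interchangeable.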

So, as a summary of the 32-bit ARM status:
+ I have a patch for vaddhn & vsubhn
+ It should already support vmull, vmlal and vmlsl
+ vqdmlal and vqdmlsl are almost certainly fine as intrinsics for now.
+ vraddhn and vrsubhn are borderline. I'll leave the decision to you.
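
For reference, the rounding variants differ from vaddhn only by the
rounding bit added before the high half is taken. The following is a
hedged scalar sketch of one 32-bit to 16-bit lane, not code from the
patch; the function names are hypothetical. It also shows the two
association orders mentioned in the quoted text, which give the same
value here because the additions wrap modulo 2^32.

```c
#include <stdint.h>

/* Assumed model of one lane of VADDHN.I32: high half of (a + b). */
static uint16_t model_vaddhn_u32(uint32_t a, uint32_t b) {
    return (uint16_t)((a + b) >> 16);
}

/* Assumed model of one lane of VRADDHN.I32, written as
 * (add (add LHS, RHS), rounding_bit). */
static uint16_t model_vraddhn_u32(uint32_t a, uint32_t b) {
    return (uint16_t)(((a + b) + (1u << 15)) >> 16);
}

/* The same operation written as (add (add LHS, rounding_bit), RHS). */
static uint16_t model_vraddhn_u32_alt(uint32_t a, uint32_t b) {
    return (uint16_t)(((a + (1u << 15)) + b) >> 16);
}
```

The results are identical either way, so the ordering only matters
for how many shapes a pattern would have to match.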

Cheers.

Tim.
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits