> There's also the order of the additions to think about: (add (add LHS,
> RHS), rounding_bit) vs (add (add LHS, rounding_bit), RHS) and so on.
> I'll press on with the others for now, since I don't see any other
> looming issues there.

I spoke too soon. VQDMLAL has an extra saturation operation that is not
present in VQDMULL followed by an addition. It probably needs to stay
separate for now (there is a much more complicated pattern that could
work, but that's well beyond the scope of a simple implementation for
AArch64).

So, as a summary of the 32-bit ARM status:
  + I have a patch for vaddhn & vsubhn.
  + It should already support vmull, vmlal and vmlsl.
  + vqdmlal and vqdmlsl are almost certainly fine as intrinsics for now.
  + vraddhn and vrsubhn are borderline. I'll leave the decision to you.

Cheers.

Tim.
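
To make the saturation point concrete, here is a minimal scalar sketch of a single 16-bit to 32-bit lane. It is only an illustration under stated assumptions: the helper names (`sat32`, `wrap32`, `vqdmull_lane`, `vqdmlal_lane`, `vqdmull_then_add_lane`) are hypothetical and are not LLVM or Clang APIs, and the plain `add` node is modelled as an ordinary wrapping 32-bit addition.

```python
# Scalar model of one lane; all names here are hypothetical helpers,
# not part of LLVM, Clang, or arm_neon.h.
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def sat32(x):
    """Clamp x to the signed 32-bit range (saturating behaviour)."""
    return max(INT32_MIN, min(INT32_MAX, x))


def wrap32(x):
    """Reduce x modulo 2**32 and reinterpret it as signed (plain add)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def vqdmull_lane(a, b):
    """Saturating doubling multiply long: sat(2 * a * b)."""
    return sat32(2 * a * b)


def vqdmlal_lane(acc, a, b):
    """The accumulate step saturates as well: sat(acc + sat(2 * a * b))."""
    return sat32(acc + vqdmull_lane(a, b))


def vqdmull_then_add_lane(acc, a, b):
    """Assumed naive lowering: vqdmull followed by a wrapping add."""
    return wrap32(acc + vqdmull_lane(a, b))
```

With an accumulator already at `INT32_MAX`, `vqdmlal_lane(INT32_MAX, 1, 1)` stays at `INT32_MAX`, while `vqdmull_then_add_lane(INT32_MAX, 1, 1)` wraps around to a negative value. The two only agree when the accumulation does not overflow, which is why the simple multiply-plus-add pattern is not enough here.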
