Hi Tim,

int_arm_neon_vaddhn/int_arm_neon_vmulls and friends are all defined by the ARM target. I agree with your comments, but do they imply that the ARM back-end's implementation is inappropriate?
Thanks,
-Jiangning

2013/8/26 Tim Northover <[email protected]>

> Hi Jiangning,
>
> I've just looked at the LLVM patch for now, since the comments may
> drastically change the Clang patch.
>
> + def _8h8b
> [...]
> + def _8H
>
> It would be nice to settle on a single naming convention for these
> instructions. Personally, I think I prefer the first, but I don't have
> a strong opinion either way.
>
> +defm SADDWvvv : NeonI_3VDW_s<0b0, 0b0001, "saddw", add, 1>;
> +defm UADDWvvv : NeonI_3VDW_u<0b1, 0b0001, "uaddw", add, 1>;
> +
> +defm SADDW2vvv : NeonI_3VDW2_s<0b0, 0b0001, "saddw2", add, 1>;
> +defm UADDW2vvv : NeonI_3VDW2_u<0b1, 0b0001, "uaddw2", add, 1>;
>
> I don't think any widening instructions are commutable. The addition
> part is, but the widening only happens to the RHS. You can't swap Rn
> and Rm on the instructions and get the same result.
>
> +defm ADDHNvvv : NeonI_3VDN_2Op<0b0, 0b0100, "addhn",
> int_arm_neon_vaddhn, 1>;
> +defm RADDHNvvv : NeonI_3VDN_2Op<0b1, 0b0100, "raddhn",
> int_arm_neon_vraddhn, 1>;
>
> Don't these have reasonably simple LLVM IR representations? For example:
>
> define <2 x i32> @addhn(<2 x i64> %lhs, <2 x i64> %rhs) {
>   %sum = add <2 x i64> %lhs, %rhs
>   %shift = lshr <2 x i64> %sum, <i64 32, i64 32>
>   %trunc = trunc <2 x i64> %shift to <2 x i32>
>   ret <2 x i32> %trunc
> }
>
> define <2 x i32> @raddhn(<2 x i64> %lhs, <2 x i64> %rhs) {
>   %sum = add <2 x i64> %lhs, %rhs
>   %rounded = add <2 x i64> %sum, <i64 2147483648, i64 2147483648>
>   %shift = lshr <2 x i64> %rounded, <i64 32, i64 32>
>   %trunc = trunc <2 x i64> %shift to <2 x i32>
>   ret <2 x i32> %trunc
> }
>
> +defm SMULLvvv : NeonI_3VDL_2Op<0b0, 0b1100, "smull",
> int_arm_neon_vmulls, 1>;
> +defm UMULLvvv : NeonI_3VDL_2Op<0b1, 0b1100, "umull",
> int_arm_neon_vmullu, 1>;
>
> Aren't these even simpler than addhn and friends? An extend followed
> by a multiply? They're also always commutable so it probably doesn't
> need to be a template parameter (same for sabdl and uabdl).
>
> +defm SQDMLALvvv : NeonI_3VDL_3Op_v2<0b0, 0b1001, "sqdmlal",
> + int_arm_neon_vqdmlal>;
>
> The qdmlals are just qdmulls with an extra addition, I think.
>
> Cheers.
>
> Tim.
>

--
Thanks,
-Jiangning
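For reference, here is a rough per-lane model of the operations discussed above, written in Python so the arithmetic is easy to check by hand. It is only a sketch under my own assumptions: the helper names (`sext32`, `addhn`, `raddhn`, `smull`) are hypothetical and not taken from the patch, and each function models a single 64-bit lane rather than a whole vector. The narrowing adds keep the high 32 bits of the wrapped 64-bit sum (a logical right shift by 32, then truncation), and the widening multiply is a sign-extend of both operands followed by a multiply.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext32(value):
    """Interpret a 32-bit pattern as a signed integer (hypothetical helper)."""
    value &= MASK32
    return value - (1 << 32) if value >= (1 << 31) else value


def addhn(lhs, rhs):
    """One lane of add-high-narrow: wrapped 64-bit add, keep the top 32 bits."""
    total = (lhs + rhs) & MASK64
    return total >> 32


def raddhn(lhs, rhs):
    """Rounding variant: add half of the discarded range before narrowing."""
    total = (lhs + rhs + (1 << 31)) & MASK64
    return total >> 32


def smull(lhs, rhs):
    """One lane of signed widening multiply: sign-extend both, then multiply."""
    return (sext32(lhs) * sext32(rhs)) & MASK64


if __name__ == "__main__":
    # 0x80000000 sits exactly on the rounding boundary of the low half.
    print(hex(addhn(0x80000000, 0)))
    print(hex(raddhn(0x80000000, 0)))
    print(hex(smull(0xFFFFFFFE, 3)))
```

Both operands of `smull` are extended in the same way, which is why swapping them is harmless there; the widening adds extend only one operand, so the same argument does not carry over to them.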
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
