https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

--- Comment #6 from krux <hoganmeier at gmail dot com> ---
-mfloat-abi=hard was missing indeed. It's a pity there's no warning like when
trying to use the intrinsics.

Still I see a lot more instructions, maybe that got fixed after v7.2?
https://godbolt.org/z/OWzgXi

  vld3.8 {d16, d18, d20}, [r3]
  add ip, r3, #24
  add lr, lr, #1
  add r3, r3, #48
  cmp lr, r5
  vld3.8 {d17, d19, d21}, [ip]
  vmovl.u8 q5, d16
  vmovl.u8 q15, d18
  vmovl.u8 q11, d17
  vmovl.u8 q4, d19
  vmovl.u8 q0, d20
  vmovl.u8 q1, d21
  vmull.s16 q6, d10, d28
  vmull.s16 q3, d22, d28
  vmull.s16 q2, d30, d26
  vmull.s16 q11, d23, d29
  vmull.s16 q15, d31, d27
  vmull.s16 q5, d11, d29
  vmull.s16 q9, d8, d26
  vmull.s16 q8, d9, d27
  vadd.i32 q2, q6, q2
  vadd.i32 q10, q5, q15
  vadd.i32 q9, q3, q9
  vmull.s16 q15, d0, d24
  vadd.i32 q8, q11, q8
  vmull.s16 q3, d2, d24
  vmull.s16 q0, d1, d25
  vmull.s16 q1, d3, d25
  vadd.i32 q11, q2, q15
  vadd.i32 q9, q9, q3
  vadd.i32 q10, q10, q0
  vadd.i32 q8, q8, q1
  vshr.s32 q11, q11, #8
  vshr.s32 q9, q9, #8
  vshr.s32 q10, q10, #8
  vshr.s32 q8, q8, #8
  vmovn.i32 d30, q11
  vmovn.i32 d31, q10
  vmovn.i32 d20, q9
  vmovn.i32 d21, q8
  vmovn.i16 d16, q15
  vmovn.i16 d17, q10
  vst1.8 {q8}, [r4]

Reply via email to