[Bug target/92665] [AArch64] low lanes select not optimized out for vmlal intrinsics

pinskia at gcc dot gnu.org Mon, 25 Nov 2019 12:01:59 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92665


--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Wilco from comment #3)
> I think it's because many intrinsics in arm_neon.h still use asm which
> inhibits most optimizations.

No, in this case it is not.

Take:
#include "arm_neon.h"

float64x1_t fun(float64x2_t a, float64x2_t b) {
  return vget_low_f64(b);
}
double fun1(float64x2_t a, float64x2_t b) {
  return b[0];
}

---- CUT ----
Both of these should be optimized to just
fmov d0, d1
ret

Even worse, take:
#include "arm_neon.h"

float64x1_t fun(float64x2_t a, float64x2_t b) {
  return vget_low_f64(b) + vget_high_f64(b);
}
double fun1(float64x2_t a, float64x2_t b) {
  return b[0] + b[1];
}

---- CUT ---
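
For anyone who wants to rule out arm_neon.h entirely, here is a minimal sketch of the same two functions written with GCC's generic vector extension. The typedef f64x2 and the names low_lane and sum_lanes are made up for illustration and are not part of the report. The assumption is that a 16-byte vector of double is a fair stand-in for float64x2_t; whether it produces the same code as the intrinsics version has not been checked here.

```c
/* Hypothetical reduced testcase: GCC generic vectors instead of arm_neon.h.
   f64x2 is an assumed stand-in for float64x2_t, not a name from the report. */
typedef double f64x2 __attribute__((vector_size(16)));

/* Low-lane extraction as a plain subscript, like fun1 in the first example.
   The unused first argument keeps b in the second vector argument register. */
double low_lane(f64x2 a, f64x2 b) {
  (void)a;
  return b[0];
}

/* Sum of both lanes, like fun1 in the second example. */
double sum_lanes(f64x2 a, f64x2 b) {
  (void)a;
  return b[0] + b[1];
}
```

Compiling this with -O2 -S for an AArch64 target and comparing against the arm_neon.h versions above should show whether the extra instructions come from the header or from the lane extraction itself.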
