https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62286

--- Comment #3 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to ktkachov from comment #2)
> (In reply to Ramana Radhakrishnan from comment #1)
> > Because the Cortex-M3 doesn't have those instructions? It's a testism,
> > probably fixed by an appropriate dg-options value.
> 
> It's not a testism, it's a costs issue.
> The FP instructions are dictated by the -mfpu option that is given
> (-mfpu=vfp is hardcoded in the dg-options here) and in any case Cortex-M3
> should support the vmla instructions as far as I know.
> The RTX costs during combine reject the combination of
> 
>          vnmul.f32       s15, s14, s15
>          vsub.f32        s15, s15, s13
> 
> into 
>          vnmla.f32       s15, s13, s14
> 
> for example.
> In particular, I think it's the mult_addsub cost. The relevant part of the
> combine log is:
> Trying 57 -> 58:
> Successfully matched this instruction:
> (set (reg:SF 134 [ D.4322 ])
>     (plus:SF (mult:SF (reg:SF 130 [ D.4322 ])
>             (reg:SF 131 [ D.4322 ]))
>         (reg:SF 133 [ D.4322 ])))
> (plus:SF (mult:SF (reg:SF 130 [ D.4322 ])
>         (reg:SF 131 [ D.4322 ]))
>     (reg:SF 133 [ D.4322 ]))
> 
> Hot cost: 24 (final)
> rejecting combination of insns 57 and 58
> original costs 12 + 8 = 20
> replacement cost 24
> 
> Is it actually beneficial for Cortex-M3 to split this up?

Well, there is no Cortex-M3 with an FPU, so this whole discussion is moot. I
don't think the costs should be changed for this if they reflect reality, i.e.
the cost of a libcall for the multiply and the cost of a libcall for the
addition.
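
To make the arithmetic in the quoted combine log concrete, here is a minimal
sketch of the comparison it reports. This is a simplified illustration, not
GCC's actual code: the helper name combination_is_profitable and its
parameters are hypothetical, and the real check in combine handles more
cases. The assumption is only that a combination is rejected when the
replacement costs more than the insns it replaces.

```c
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): keep a combination only if the
   replacement is no more expensive than the two original insns together. */
static bool combination_is_profitable(int cost_first_insn,
                                      int cost_second_insn,
                                      int replacement_cost)
{
    int original_cost = cost_first_insn + cost_second_insn;
    return replacement_cost <= original_cost;
}
```

With the numbers from the log, combination_is_profitable(12, 8, 24) is false,
because 24 exceeds 12 + 8 = 20, which matches the "rejecting combination of
insns 57 and 58" line.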

Ramana
