Hi Bernd,

On 22/01/16 14:53, Bernd Schmidt wrote:
> On 01/22/2016 10:52 AM, Kyrill Tkachov wrote:
>
>> AFAICT the new sequence is better than the old one even for
>> -mtune=cortex-a9 since it contains two fewer instructions.
>
> Just curious (I think this patch series is good but will leave it to the arm
> folks) - are these instructions equally expensive? Some CPUs are faster when
> doing widening multiplies on smaller objects.


The widening multiplies are indeed faster on some targets (which is why we want
to keep them in the wmul-[12].c tests).
But for wmul-3.c the new sequence uses fewer instructions. So, while the
resulting sequences should be of similar performance overall, the new sequence
has a smaller code size.

Kyrill


Bernd
