Re: Out of Order and Superscalar - small experiment

Rob van der Heij Tue, 03 Jun 2014 02:02:29 -0700

On 3 June 2014 09:19, Robin Vowels <[email protected]> wrote:

> On 2 June 2014 19:56, Robin Vowels <[email protected]> wrote:
>
>> From: "Tony Harminc" <[email protected]>
>>> Sent: Tuesday, June 03, 2014 3:30 AM
>>>
>>
>>  Is LHI Rn,0 faster than SR Rn,Rn? I'd expect them to be the same, but
>>>> SR is half the size, and so lessens the amount of i-cache used.
>>>>
>>>
>>> XR Rn,Rn is faster than SR.
>>>
>>
>> I doubt it. Perhaps on much older machines, but I have little doubt
>> that modern machines have special cases for most or all common ways of
>> zeroing a register, e.g. SR Rn, Rn and XR Rn, Rn and LA Rn,0 and LHI
>> Rn,0 .
>>
>
> Longer instructions tend to run slower.
>
>
We were asking about your claim that "XR is faster than SR", but both are
RR-type instructions of the same length. As far as I know, SR was faster
than XR very long ago, but on later machines that was no longer the case.
That message may have been extrapolated into the idea that XR is now faster
than SR, but I have not seen evidence of that. Someone suggested that LA
would be done in the address adders and so have an advantage over LHI, even
to the point of outweighing the extra two bytes compared with an SR.
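
For reference, the four zeroing idioms mentioned in this thread can be
sketched as below. Register 1 is an arbitrary choice for illustration. The
comments give only the instruction format, length, and effect on the
condition code; they say nothing about relative speed on any particular
machine.

```
*        Four common ways to clear general register 1
         SR    1,1           RR format, 2 bytes, sets condition code
         XR    1,1           RR format, 2 bytes, sets condition code
         LA    1,0           RX format, 4 bytes, condition code unchanged
         LHI   1,0           RI format, 4 bytes, condition code unchanged
```

The size difference is what the i-cache remark above refers to: the two RR
forms take half the space of LA or LHI.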

You're most welcome to borrow this thread to discuss another algorithm, but
maybe it's confusing enough already. If the algorithm were not relevant, I
could have used the CKSM as pointed out. I just wanted to share my
experience where the CPU did a better job in reordering instructions than
the compiler. That surprised me. Considering the audience, I provided some
background for those who care.

My simple version did not require wiping the register before the IC, and I
did not code that either. The compiler-generated code does require it, and
I suppose that is why it was slower.
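
As background on why a wipe can be needed at all: IC replaces only the
low-order 8 bits of the register and leaves the other bits unchanged. A
minimal sketch of the pattern follows. The register numbers are
hypothetical, and this is not the actual compiler output.

```
*        Hypothetical fragment: fetch one byte as an unsigned value
         SR    3,3           clear register 3 so the upper bits are zero
         IC    3,0(,2)       insert byte addressed by register 2 into low 8 bits
```

If the upper bits of the register are already known to be zero, the SR can
be dropped.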

Rob
