On 2 June 2014 20:14, Robin Vowels <[email protected]> wrote:
> From: "Rob van der Heij" <[email protected]>
> Sent: Tuesday, June 03, 2014 1:00 AM
>> More recently I've been working on porting Linux gcc object code to CMS,
>> and now that I needed a nice checksum routine, I figured I might take a
>> popular open source checksum routine http://en.wikipedia.org/wiki/Adler-32
>> and let gcc compile and optimize it. Since the generated assembler source
>> wasn't that obvious to me, I was getting interested to know why.
>>
>> My simplistic implementation was like this (for each byte, so wrapped in a
>> loop)
>>
>>
>> *
>>        IC    R4,0(R6)
>>        AR    R2,R4
>>        AR    R3,R2
>> *
>
> Must have missed something here.

I think what you missed was the reference to the Adler-32 algorithm, with
its need to keep two 16-bit sums.

> A 3-instruction loop to sum bytes.
>
>        LA    6,X+offset    (last byte of area to be summed)
>        SR    2,2
>        SR    4,4
> Loop   IC    4,0(0,6)
>        AR    2,4
>        BCT   6,Loop
>
> And you can use BCTR to save a few µs.

Why do you think BCTR would save such a large amount of time? Perhaps
you're again talking about old machines. Surely BRCT/JCT would be the time
saver on a current machine, if there is one for this case.

Tony H.
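
For anyone following along without the Wikipedia page open, the two running sums that Adler-32 needs can be sketched in C roughly as below. This is only a plain, unoptimized illustration of the published algorithm, not the code Rob compiled; the name adler32_sketch is made up for this example, and the comments mapping statements to the IC/AR sequence are my reading of the quoted fragment.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest prime below 2^16, as used by the Adler-32 definition. */
#define ADLER_MOD 65521u

/* Hypothetical helper name; a straightforward sketch, reducing on every byte. */
uint32_t adler32_sketch(const unsigned char *data, size_t len)
{
    uint32_t a = 1;  /* first sum: 1 plus all bytes seen so far        */
    uint32_t b = 0;  /* second sum: total of the successive values of a */

    for (size_t i = 0; i < len; i++) {
        a = (a + data[i]) % ADLER_MOD;  /* roughly IC R4,0(R6) / AR R2,R4 */
        b = (b + a) % ADLER_MOD;        /* roughly AR R3,R2               */
    }

    /* The second sum goes in the high halfword, the first in the low one. */
    return (b << 16) | a;
}
```

A single-sum loop yields only the first of these two values. Real implementations usually defer the modulo over a run of bytes rather than applying it on each one, which may be part of why compiler output for this routine looks less obvious than the three-instruction version.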
