When looking for a specific byte, a CLI loop is my weapon of choice.
Unless I'm dealing with frequently executed code, I'm happy to simply
embrace the TRT instruction along with the other unabashedly
CISC-to-the-max members of the z/Architecture instruction set.
Ultimately you just have to experiment, because you're optimizing for a
black box and the only instrumentation available is CPU time used.
Instruction order, branch points and branch target locations can make
big differences, so it's worth trying various combinations to see what
works best (putting the first instruction of a tight loop on a
doubleword boundary, for example).
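
For illustration only, here is a minimal sketch of the kind of CLI loop
described above, next to a TRT that does the same job. FIELD, FOUND,
NOTFOUND and BLANKTAB are hypothetical names, the R0/R1 register equates
are assumed to be defined elsewhere, and the CNOP placement is just one
of the combinations worth measuring, not a known win.

```hlasm
*        CLI loop: find the first blank in FIELD (hypothetical names)
         LA    R1,FIELD            R1 -> first byte to examine
         LA    R0,L'FIELD          R0 = number of bytes to scan
         CNOP  0,8                 Pad so SCAN is doubleword aligned
SCAN     CLI   0(R1),C' '          Is this byte a blank?
         JE    FOUND               Yes: R1 -> the blank
         LA    R1,1(,R1)           No: step to the next byte
         BRCT  R0,SCAN             Repeat until the count runs out
         J     NOTFOUND            No blank anywhere in FIELD
*
*        TRT alternative: same search, one instruction plus a table
         TRT   FIELD,BLANKTAB      CC=0 if no blank, else R1 -> blank
         JZ    NOTFOUND            All function bytes were zero
         J     FOUND               R1 -> first blank in FIELD
*
FIELD    DS    CL8                 Assumed 8-byte input field
BLANKTAB DC    256X'00'            Function table: zero = keep going
         ORG   BLANKTAB+C' '       Position at the entry for a blank
         DC    X'FF'               Nonzero = stop on this byte
         ORG   ,                   Back to the end of the table
```

Note that TRT also inserts the function byte into the low-order byte of
R2, so R2 has to be free. Which of the two variants uses less CPU time
is exactly what has to be measured.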

Keven

-----Original Message-----
From: IBM Mainframe Assembler List
[mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Rob van der Heij
Sent: Thursday, January 12, 2012 7:16 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: How bad is the EX instruction?

On Fri, Jan 13, 2012 at 1:05 AM, Hall, Keven <keh...@informatica.com>
wrote:

> If you're looking to reduce CPU usage you might want to optimize the
> TRT the heck out of the equation.  Talk about expensive!  [augment
> with imagined or actual cash register "cah-ching" sound for added
> emphasis/effect]

Ok...  but how?  Would a loop stepping over the max 8 bytes be wiser to
find the first blank? Another idea I had was to step a 2-byte CLC with
'* ' over the string, but the complexity and the handling of the end of
the string spoil the fun.

Guess I never really measured TRT.  A variation of this code is used
to search items in a linked list. I obviously moved the TRT out of the
loop and that might have helped make it faster.

Rob
