There is a new option now, especially with non-zero codes:

LHI  R15,4

No storage fetch.
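
For comparison, here is a minimal sketch of the usual ways to set a small non-zero return code. It assumes the customary R15 register equate and a return code of 4. The comments describe storage references only, not timings, which vary by processor model:

         LHI   R15,4             Immediate operand, no operand fetch
         LA    R15,4             Displacement only, no operand fetch
         L     R15,=F'4'         Fullword fetch from the literal pool

LHI takes a signed 16-bit immediate (-32768 through 32767), while LA with no base or index register is limited to 0 through 4095.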

The subject of instruction timings on IBM-MAIN and ASSEMBLER-LIST comes up
now and then.  I point y'all to the archives of both lists.  With the new
z/Architecture pipelines and caches, sometimes what seems at first to be
illogical instruction placement may actually be better.  A hypothetical
illustration:

L    R4,RECPTR    Load record address from RECPTR
AHI  R6,1         Add 1 to counter
AHI  R8,(-8)      Some other strange counter
CLI  16(R4),X'40'
JE   GOHERE

The z/Architecture processor will execute the two AHI instructions while the
base/displacement calculation and storage access for the L instruction are
in progress, because it knows that R4 isn't affected by those instructions.
By the time the CLI is reached, R4 will contain the address, so you avoid the
delay that might occur if you code

AHI  R6,1         Add 1 to counter
AHI  R8,(-8)      Some other strange counter
L    R4,RECPTR    Load record address from RECPTR
CLI  16(R4),X'40'
JE   GOHERE

In this case, there might be a delay at the CLI.

Speaking of branches, there's been an interesting discussion recently about
the branch-prediction logic in z/Architecture, which is why I demonstrate
with the R&I (or is it I&R? I can never remember) instruction.

Later,
Ray

-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Mike Walter
Sent: Monday December 04 2006 12:02
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: CMSCALL return code

Sheesh, this goes way back to my good old Assembler diaper days when
programmers really cared about performance instead of drag and drop
solutions.
Slightly off-topic: if I remember correctly, we argued intensely about
zeroing a GPR and the performance differences between: 

- SR R15,R15
- XR R15,R15
- LA R15,0    (not seriously considered by performance geeks)
- L R15,=F'0' (considered for use only by amateur programmers coming from a
BASIC or COBOL background and otherwise held in low esteem by "real
programmers").  ;-)

IIRC, the actual performance difference between SR and XR depended more on
the specific processor model than on anything else.

Mike Walter
Hewitt Associates
Any opinions expressed herein are mine alone and do not necessarily
represent the opinions or policies of Hewitt Associates.

-----Original Message-----
From: "Schuh, Richard" <[EMAIL PROTECTED]>
Sent by: "The IBM z/VM Operating System" <IBMVM@LISTSERV.UARK.EDU>
Sent: 12/04/2006 11:37 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: CMSCALL return code

True, and it is undoubtedly faster to use SR  R15,R15 than it is to use LA
R15,0 to zero the register - there are no storage fetches and real
subtraction is not needed if the result can be predicted, as it can in this
case. However, the discussion had more to do with fetches of
boundary-aligned vs. non-aligned data. There was no mention of the optimum
speed for getting either a specific or an arbitrary value loaded into a
register. In this day of pipelined machines
 
This is sort of reminiscent of the good old days, programming in 7080
Autocoder. Boeing insisted that the programmers use a MOVE macro because
there were 26 different ways to move data from one storage location to
another. It was expected that most programmers would use either their
favorite way or the first one that popped into their heads if left on their
own. The macro chose the optimal way, depending on the operand definitions.

From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Stanley Rarick
Sent: Friday, December 01, 2006 10:37 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: CMSCALL return code

For a return code, LA R15,value is *much* faster than an L: only one storage
fetch (the instruction itself), where L needs a second fetch for its operand.

Schuh, Richard wrote:
I really would not have left it to chance, I would have defined a
word-aligned constant rather than using a literal. However, it might not
have been as chancy as it may seem. The literal pool is doubleword aligned
and boundary alignment may have been a factor in determining where the
literal resided. I would like to think that the 8-byte multiples are put at
the front, the 4-byters next, then the twos followed by everybody else. In
looking at an assembly listing, that seems to be the sequence. The first two
literals in the program are =x'0000A00', the next =x'FF', etc. In the
literal pool, all 4 byte entries (there were no 8 byte literals) precede the
two byte literals and then come the ones of only 1 byte. Within each of
these groups, the literals appear in the order in which they were defined.
There were no long strings defined as literals in the particular listing. 

-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Don Russell
Sent: Tuesday, November 21, 2006 3:46 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: CMSCALL return code

Schuh, Richard wrote:
 
I agree, it does seem non-intuitive. The initial SR   R15,R15 was
undoubtedly preparing for a default rc of zero. How the non-zero rc gets put
into the register later is largely a matter of taste. In this case I
probably would have chosen L   R15,=X'...' - a habit learned, when
machines were slower, based on the knowledge that they were mostly optimized
for the LOAD instruction vs. any other way of putting data from memory into
a register.

If your habit was to use L Rx,=X'...' you were probably lucky in the old
days: the =X literal would not necessarily be word-aligned, causing
two fetches to load the register or, in the days when alignment really
mattered, a program exception.

Better to use L R15,=A(X'...') if alignment is a concern and you want to
use literals.

Then the literal IS aligned on a fullword boundary.
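
A minimal sketch of the alignment choices mentioned above, assuming a return code of X'0A00' (decimal 2560), the customary R15 equate, the default ALIGN assembler option, and a hypothetical label RC2560. Each of the three loads fetches a fullword-aligned operand:

         L     R15,=A(X'0A00')   A-type literal, fullword aligned
         L     R15,=F'2560'      F-type literal, fullword aligned
         L     R15,RC2560        Programmer-defined constant
*        ... elsewhere, in the data area ...
RC2560   DC    F'2560'           F-type DC aligns on a fullword

The named constant costs one extra line but makes the alignment explicit instead of depending on where the literal lands in the pool.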

The initial SR 15,15 is unlikely to be setting the default return code...
it's clearing the register preparing for the different option bytes to
be OR'd in. I agree the macro could (should?) have generated a single L
instruction instead, but then what nits would we have to discuss? :-)

 


 