In a message dated 7/14/2005 1:25:33 A.M. Central Daylight Time,  
[EMAIL PROTECTED] writes:

Bill Fairchild wrote:
> However, even though it is not of much value, it is certainly of interest.
> If you really want to know how to speed instructions up, you must be prepared
> to read lots of highly arcane technical papers on instruction processing
> units, pipelines, instruction caches, translation lookaside buffers, data
> caches, bus width, look-ahead instruction preprocessing, multiple processor
> serialization effects, instruction predecessor relationships, et alia.
....
 


Or you could use a little assembler program, using STCK or TIMEUSED, and
execute contemplated code several hundred to several thousand times
each, and compare the results. No reading of papers, no head scratching,
just numbers for your environment.....
 
Right. If I wanted to know how long instruction op code XYZ takes to
execute, I would certainly do it the way you suggested. Reading papers and
head scratching would be interesting to me, since I am interested in learning
how instruction processing takes place at a low level in general. But for any
one particular op code I would perform the experiment you described. I also
once put a STCK immediately in front of and immediately behind an instruction
that I wanted to learn about, Store SCHIB, and found it took something
like 60 microseconds, which was a huge amount of time compared to all other
instructions. After I saw that, I removed the Store SCHIB since it wasn't
necessary. The Principles of Operation even warns about using this
instruction a lot, because it can cause performance problems; it must be doing
some serialization in the channel subsystem. To be really, really accurate,
you must also first find out how much overhead you are imposing on your
experiment by using STCK and any looping instructions, so you have to test
each of them several thousand times, get averages, and subtract that
overhead from your measurements.
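
The calibration step described above can be sketched in a few lines. This is
only an analog in Python, not the STCK-based assembler experiment itself:
time.perf_counter_ns stands in for STCK, and the names average_ns,
net_cost_ns, and empty are made up for illustration. It times an empty
harness first, then the operation under test, and subtracts the two averages.

```python
import time


def average_ns(func, repeats=10_000):
    """Average wall-clock nanoseconds per call of func over `repeats` calls."""
    start = time.perf_counter_ns()      # stands in for the first STCK
    for _ in range(repeats):
        func()
    end = time.perf_counter_ns()        # stands in for the second STCK
    return (end - start) / repeats


def empty():
    """Does nothing; timing it measures loop and call overhead only."""
    pass


def net_cost_ns(func, repeats=10_000):
    """Estimated cost of func with the harness overhead subtracted."""
    overhead = average_ns(empty, repeats)
    gross = average_ns(func, repeats)
    return gross - overhead             # can come out slightly negative from noise


if __name__ == "__main__":
    data = bytearray(1024)
    # Hypothetical operation under test: copying a 1K buffer.
    print("1K copy:", net_cost_ns(lambda: bytes(data)), "ns per call (approx.)")
```

Absolute numbers from a sketch like this depend heavily on the machine and on
whatever else is running at the time, so only comparisons made in the same
run mean much.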
 

One interesting result was that one MVCL for 1K takes about as long as
four MVCs of 256; below that MVCs are faster on every processor I
tested. Another surprise (?) was that two STs were faster than an STM
for two registers.

 
These are surprising and interesting results. But I would still not be
motivated to perform a timing experiment unless the code I was thinking about
optimizing was going to be executed a very large number of times per second in
some critical path or perhaps in a tight loop. If I were building a compiler,
however, I would be concerned about trying to optimize code execution as
much as possible in a generalized way, which means I would not know which
machine the code was to run on, or which of its individual peculiarities to
consider. But I don't build compilers.
 
Bill Fairchild

