In a message dated 7/14/2005 1:25:33 A.M. Central Daylight Time, [EMAIL PROTECTED] writes:
Bill Fairchild wrote:
> However, even though it is not of much value, it is certainly of interest.
> If you really want to know how to speed instructions up, you must be prepared
> to read lots of highly arcane technical papers on instruction processing
> units, pipelines, instruction caches, translation lookaside buffers, data
> caches, bus width, look-ahead instruction preprocessing, multiple processor
> serialization effects, instruction predecessor relationships, et alia.

.... Or you could use a little assembler program, using STCK or TIMEUSED, and execute contemplated code several hundred to several thousand times each, and compare the results. No reading of papers, no head scratching, just numbers for your environment.....

Right. If I wanted to know how long instruction op code XYZ takes to execute, I would certainly do it the way you suggested. Reading of papers and head scratching would be interesting to me since I am interested in learning how instruction processing takes place on a low level - in general. But for any one particular op code I would perform the experiment you described.

I also once put a STCK immediately in front of and immediately behind an instruction that I wanted to learn about - Store SCHIB - and found it took something like 60 microseconds, which was a huge amount of time compared to all other instructions. After I saw that, I removed the Store SCHIB since it wasn't necessary. The Princ. of Ops even warns about using this instruction a lot - can cause performance problems - must be doing some serialization in the channel subsystem.

To be really, really accurate, you must also first find out how much overhead you are imposing on your experiment by using STCK and any looping instructions, so you have to test each of them several thousand times and get averages.

One interesting result was that one MVCL for 1K takes about as long as four MVCs of 256; below that MVCs are faster on every processor I tested. Another surprise (?) was that two STs were faster than an STM for two registers.

These are surprising and interesting results. But I would still not be motivated to perform a timing experiment unless the code I was thinking about optimizing was going to be executed a very large number of times per second in some critical path or perhaps in a tight loop. If I were building a compiler, however, I would be concerned about trying to optimize code execution as much as possible in a generalized way, which means you would not know what machine the code was to be run on with individual machine peculiarities to consider. But I don't build compilers.

Bill Fairchild

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
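
The measurement method discussed in this post (run the code under test several thousand times, time the same loop without it, and subtract the overhead before averaging) can be sketched roughly as follows. This is only an illustration in Python, not the assembler program referred to above: time.perf_counter_ns stands in for STCK, and the names time_loop and net_cost_ns are hypothetical. Interpreter overhead dominates the numbers here, so the sketch shows the subtraction technique, not realistic instruction timings.

```python
import statistics
import time


def time_loop(body, iterations):
    """Return elapsed nanoseconds for calling body() `iterations` times."""
    start = time.perf_counter_ns()  # stands in for the STCK before the code
    for _ in range(iterations):
        body()
    end = time.perf_counter_ns()    # stands in for the STCK after the code
    return end - start


def net_cost_ns(body, iterations=100_000, repeats=5):
    """Estimate the per-call cost of body() in nanoseconds.

    The same loop is first timed with an empty body to measure the
    overhead of the timer reads and the looping itself; that average
    is subtracted from the average for the real body.
    """
    def empty():
        pass

    overhead = statistics.mean(
        time_loop(empty, iterations) for _ in range(repeats)
    )
    gross = statistics.mean(
        time_loop(body, iterations) for _ in range(repeats)
    )
    # Clamp at zero: noise can make the net figure slightly negative
    # when body() is about as cheap as the empty call.
    return max(gross - overhead, 0) / iterations


if __name__ == "__main__":
    data = bytearray(1024)
    cost = net_cost_ns(lambda: bytes(data))  # copy 1K, loosely like a move
    print(f"approximate net cost per 1K copy: {cost:.1f} ns")
```

As the post says, a figure obtained this way applies only to the environment it was measured in.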