On Thu, Aug 12, 2010 at 8:51 AM, McKown, John <[email protected] > wrote:
> New instructions do not necessarily run more efficiently than previous
> instructions. It depends on what the instructions do, of course. I have not
> tested it myself, but I've been told that on some processors, the MVCL
> instruction is actually slower than doing a corresponding loop using MVC.
> And I wonder if MVCLE is more efficient than MVCL. I also remember when IBM
> went from BiPolar machines (3090?) to CMOS (?). The packed decimal
> instructions performed dismally.
>
> Now, some of the recent instructions (z10 and above) sure sound like they
> are more efficient. Such as using the new compare and branch instructions
> instead of the separate compare & branch instructions. But not always. If
> you need a "two way" compare, then "compare and branch" makes sense. But if
> you need a "three way" compare (such as compare against zero, branch one
> place if negative, another place if zero, and next instruction if positive),
> then a separate compare followed by 2 branch instructions might be more
> efficient than two "compare and branch" instructions. But without a z10 to
> test on, I don't know that for certain.
>
> Now, the OS level you're running has nothing to do with or influence the
> efficiency of the instructions in your program. So if you get a z10 or z196
> and start using the new instructions, then you get the benefits of the new
> instructions. If an old instruction has its execution improved, then you get
> that improvement in your code. If an old instruction executes slower on the
> new processor, then your code will suffer.
>
> I think your general question has an answer of "false". Basically the new
> CPU is "faster" because of a better cycle time and because of more efficient
> hardware or millicode. The two together are used to calculate the "MIPS" of
> the machine. Of course MIPS and even MSUs are now only marketing propaganda
> with little technical meaning.
>

Indeed. This is a great post.

I'd add that things like pipelining and out-of-order execution and the like
can have large and unintuitive effects. I remember some old IBM code (EXEC 2
source on VM) that had comments like "do this while R2 settles"; that surely
held true when the code was written, and for at least ten minutes after that.

Now on zEnterprise, the hardware makes those decisions for you -- says "OK,
he's doing a load into R2, then an add to R2, then an unrelated load into
R3 -- I can swap the last two instructions and make it faster without
changing the result" (no, I don't claim that's a real example, just the kind
of thing that OOO execution enables).

So an LR instruction on a 4.4GHz z10 *might* be 15% slower than on a 5.2GHz
zEnterprise, or it might be 20% or 10% slower. And thanks to OOO execution,
that will even vary from LR to LR.

The good news is, this is a performance question, so an answer of "it
depends" is traditional and thus appropriate!

--
zMan -- "I've got a mainframe and I'm not afraid to use it"
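
P.S. The 15% figure is what the two clock rates alone would predict. Here is a rough sketch of that arithmetic in Python; the function names are made up for illustration, and it assumes speed scales with clock rate and nothing else, which is exactly the assumption that pipelining and OOO execution break:

```python
# Clock rates quoted above, in GHz.
Z10_GHZ = 4.4
ZENTERPRISE_GHZ = 5.2


def percent_slower(slow_ghz, fast_ghz):
    """Percent by which the lower-clocked machine trails the faster one,
    assuming (unrealistically) that clock rate is the only factor."""
    return (1 - slow_ghz / fast_ghz) * 100


def percent_faster(slow_ghz, fast_ghz):
    """The same gap, measured from the slower machine's point of view."""
    return (fast_ghz / slow_ghz - 1) * 100


print(f"z10 vs zEnterprise: {percent_slower(Z10_GHZ, ZENTERPRISE_GHZ):.1f}% slower")
print(f"zEnterprise vs z10: {percent_faster(Z10_GHZ, ZENTERPRISE_GHZ):.1f}% faster")
```

That comes out to roughly 15.4% slower one way and 18.2% faster the other. It is only a clock-ratio baseline; what any given LR actually costs is still "it depends".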

