On Thu, Aug 12, 2010 at 8:51 AM, McKown, John <[email protected]> wrote:

> New instructions do not necessarily run more efficiently than previous
> instructions. It depends on what the instructions do, of course. I have not
> tested it myself, but I've been told that on some processors, the MVCL
> instruction is actually slower than doing a corresponding loop using MVC.
> And I wonder if MVCLE is more efficient than MVCL. I also remember when IBM
> went from BiPolar machines (3090?) to CMOS (?). The packed decimal
> instructions performed dismally.
>
> Now, some of the recent z10 and above sure sound like they are more
> efficient. Such as using the new compare and branch instructions instead of
> the separate compare & branch instructions. But not always. If you need a
> "two way" compare, then "compare and branch" makes sense. But if you need a
> "three way" compare (such as compare against zero, branch one place if
> negative, another place if zero, and next instruction if positive), then a
> separate compare followed by 2 branch instructions might be more efficient
> than two "compare and branch" instructions. But without a z10 to test on, I
> don't know that for certain.
>
> Now, the OS level you're running has nothing to do with or influence the
> efficiency of the instructions in your program. So if you get a z10 or z196
> and start using the new instructions, then you get the benefits of the new
> instructions. If an old instruction has its execution improved, then you get
> that improvement in your code. If an old instruction executes slower on the
> new processor, then your code will suffer.
>
> I think your general question has an answer of "false". Basically the new
> CPU is "faster" because of a better cycle time and because of more efficient
> hardware or millicode. The two together are used to calculate the "MIPS" of
> the machine. Of course MIPS and even MSUs are now only marketing propaganda
> with little technical meaning.
>

Indeed. This is a great post.

I'd add that things like pipelining and out-of-order execution and the like
can have large and unintuitive effects.

I remember some old IBM code (EXEC 2 source on VM) that had comments like
"do this while R2 settles"; that surely held true when the code was written,
and for at least ten minutes after that. Now on zEnterprise, the hardware
makes those decisions for you -- says "OK, he's doing a load into R2, then
an add to R2, then an unrelated load into R3 -- I can swap the last two
instructions and make it faster without changing the result" (no, I don't
claim that's a real example, just the kind of thing that OOO execution
enables).
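
To make the reordering idea concrete, here is a toy sketch in Python. It is
not a model of any real processor: the three-instruction "program", the
run() helper and the independent() check are all made up for illustration,
and real hardware tracks far more than register names. It only shows that
swapping two instructions that touch different registers leaves the final
state unchanged.

```python
# Toy register machine (hypothetical): each instruction is (op, register, value).
def run(program):
    regs = {}
    for op, reg, value in program:
        if op == "load":
            regs[reg] = value              # load an immediate into the register
        elif op == "add":
            regs[reg] = regs[reg] + value  # add an immediate to the register
        else:
            raise ValueError(f"unknown op: {op}")
    return regs


def independent(a, b):
    # In this toy model an instruction reads and writes only its own register,
    # so two instructions are independent when they name different registers.
    return a[1] != b[1]


in_order = [("load", "R2", 10), ("add", "R2", 5), ("load", "R3", 7)]
# Swap the last two instructions, as in the example above.
reordered = [in_order[0], in_order[2], in_order[1]]

print(independent(in_order[1], in_order[2]))  # the swapped pair: different registers
print(independent(in_order[0], in_order[1]))  # load R2 / add R2: must stay in order
print(run(in_order) == run(reordered))        # same final register contents
```

The load and add on R2 fail the independence check, which is why only the
last two instructions are candidates for swapping.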

So an LR instruction on a 4.4GHz z10 *might* be 15% slower than on a 5.2GHz
zEnterprise, or it might be 20% or 10% slower. And thanks to OOO execution,
that will even vary from LR to LR.
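
For what it's worth, the 15% figure is just the clock ratio. A quick
back-of-the-envelope sketch in Python, assuming (unrealistically) that the
instruction takes the same number of cycles on both machines; the variable
names are mine:

```python
z10_ghz = 4.4    # z10 clock rate quoted above
z196_ghz = 5.2   # zEnterprise clock rate quoted above

# Fraction of the faster machine's clock rate that the z10 lacks.
rate_deficit = 1 - z10_ghz / z196_ghz

# Extra time per cycle on the z10 relative to the zEnterprise.
extra_time = z196_ghz / z10_ghz - 1

print(round(rate_deficit * 100, 1))  # 15.4
print(round(extra_time * 100, 1))    # 18.2
```

So "15% slower" is the rate view; measured as elapsed time per cycle it is
closer to 18%. Either way it is only the clock-only baseline, before
pipelining and OOO effects move the real number around.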

The good news is, this is a performance question, so an answer of "it
depends" is traditional and thus appropriate!
-- 
zMan -- "I've got a mainframe and I'm not afraid to use it"
