Re: Millicode Instructions
MVCX sounds a bit like MVCOS with R00=0. The PoOP says MVCOS may be significantly slower than MVC, but I would be interested to see a comparison between it and an executed MVC, i.e. for use in short(ish) variable-length moves.

Robert Ngan
CSC Financial Services Group

From: Peurifoy, Richard L r-peuri...@neo.tamu.edu
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Date: 2013/04/17 10:49
Subject: Re: Millicode Instructions
Sent by: IBM Mainframe Assembler List ASSEMBLER-LIST@LISTSERV.UGA.EDU

> Some millicode instructions will outperform their PoOp-code counterparts because millicode has access to hardware features not available to ordinary code. For example, MVCL(E) has the ability to move data under certain conditions without loading it into cache. (You can't do that with looping MVC.) Millicode routines also have access to the MVCX instruction, which performs a variable-length MVC -- something ordinary programs cannot do without using the EXecute instruction.

MVCX sounds like it would be useful for non-millicode. Any idea why it was not externalized? Is there a corresponding CLCX?

--
Richard
Re: Millicode Instructions
On 4/17/2013 8:44 AM, Peurifoy, Richard L wrote:
> Is there a corresponding CLCX?

I assume yes, although I know not for sure...

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/
Re: Millicode Instructions
Well, I for one don't go along with the obsession with saving a nanosecond here and there. In any case, as someone just pointed out, a compiler can do far more optimization than one can manage by hand, and compiler writers spend large amounts of time determining optimal instruction sequences for certain operations and developing algorithms to compile an optimal solution for each piece of code. Modern software systems like Microsoft .Net can compile at run-time for the current hardware architecture. So much for TRT vs TRTE.

In any case, I suggest that few active product developers have time to really learn all the endless new z/Architecture instructions, continually compare them, and determine which would be optimal in this or that situation. (But maybe I'm just too lazy nowadays.)

What rarely (if ever) gets addressed here is good programming practice (in a wider sense than how to load a base register or whatever), something you continually encounter on HLL forums, but rarely, somehow, in assembler. That for me means writing code that works correctly, is understandable, reflects in its structure the logic of the problem, and can be easily modified and expanded in scope. Cryptically clever code is generally best avoided, much as it seems to appeal to certain kinds of programmers.

Efficiency of individual small code sections is mostly pretty irrelevant unless they are at the center of a loop which is executed a vast number of times. Not so long ago it was suggested by one of the more august personalities here that I should not use a system macro for its intended purpose but rather some allegedly quicker set of instructions accessing the same data via control block pointers. However, since the code is executed once at start-up of a permanently active STC, the issue of speed was not very relevant.

Good practices in my view would also exclude enormous code sections requiring numerous base registers (even if replaced by relative branches). Our coding standards never gave rise to a need for more than one code base register, although it's all baseless nowadays and uses 64 bit code and the odd ZS3 instruction, even. In fact we recently implemented a pre-loader with the aim of loading different code versions for modern or older machines, but have seen no pressing need to use it yet for that purpose (it has other functions as well). Structured programming in small logical sections is something that can be practiced in assembler too.

I am responsible for several products in use around the world in large IBM mainframe computer centers. They are all written in assembler (for various good reasons from the distant past; starting again today might change things, of course). Although we occasionally hear a customer complain that we are using too much CPU, it is generally due to poor use of the products' facilities and not to obvious weaknesses in the code.

Speed in a program often depends more on the architecture of the code than on individual instructions. Running serially through long lists or tables to find stuff is a common cause of CPU hotspots. One solution is to use a hash table. Using methods like bubble sort rather than, say, quicksort to sort data in storage makes the programming easy, but much, much slower. Of course in pure assembler these kinds of things have to be programmed. (The nicer part of using an HLL, in the wider world of Java, C#, C++ etc. anyway, is having large libraries of functions available to do such things.) Misuse of system functions can cause issues too; some of our early I/O code caused problems by issuing unnecessary PGSER RELEASE requests, for example. Such things can be determined by suitable tools.

Btw, my last reply to one of your posts got caught by the reply-to issue, but I didn't feel a great need to post it again to the list. It wasn't my intention to reply to you personally.
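As a rough illustration of the serial-search versus hash-table point, here is a minimal sketch in C (not assembler, purely for brevity). Every name in it (`entry`, `find_serial`, `table_insert`, `table_find`, `TABLE_SLOTS`) is hypothetical, and the FNV-1a hash with linear probing is simply one common choice, not anything taken from the products mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SLOTS 64          /* hypothetical size; must be a power of two */

struct entry { const char *key; int value; };

/* Serial search: the cost grows linearly with the number of entries. */
const struct entry *find_serial(const struct entry *list, size_t n,
                                const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].key, key) == 0)
            return &list[i];
    return NULL;
}

/* 32-bit FNV-1a string hash (an assumed choice; any decent hash would do). */
uint32_t hash_key(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Open-addressing table of pointers; must start out zero-initialized. */
struct table { const struct entry *slot[TABLE_SLOTS]; size_t used; };

/* Insert without checking for duplicate keys (kept simple on purpose). */
void table_insert(struct table *t, const struct entry *e)
{
    /* Keep at least one slot empty so that every probe sequence ends. */
    assert(t->used < TABLE_SLOTS - 1);
    size_t i = hash_key(e->key) & (TABLE_SLOTS - 1);
    while (t->slot[i] != NULL)              /* linear probing on collision */
        i = (i + 1) & (TABLE_SLOTS - 1);
    t->slot[i] = e;
    t->used++;
}

/* Lookup: typically a handful of probes, whatever the entry count. */
const struct entry *table_find(const struct table *t, const char *key)
{
    size_t i = hash_key(key) & (TABLE_SLOTS - 1);
    while (t->slot[i] != NULL) {
        if (strcmp(t->slot[i]->key, key) == 0)
            return t->slot[i];
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return NULL;                            /* hit an empty slot: not present */
}
```

A caller would declare `struct table t = {0};`, call `table_insert` once per entry at start-up, and then use `table_find` wherever the code previously ran down the list. The one-time build cost only pays off when lookups are frequent, which is the hotspot case described above.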
DS

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Scott Ford
Sent: Wednesday, 17 April 2013 00:55
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Millicode Instructions

Ed, I want to ask a question: in this day and age of processing power, is it really worth being concerned about Assembler instruction speed? Unless there is some application that is very time-sensitive; that I understand.

Regards,
Scott J Ford
Software Engineer
http://www.identityforge.com/

From: Ed Jaffe edja...@phoenixsoftware.com
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Sent: Tuesday, April 16, 2013 6:13 PM
Subject: Re: Millicode Instructions

On 4/16/2013 12:43 PM, Gibney, Dave wrote:
> I don't get to work at this level often, but I am always interested. How can Millicode be faster than the equivalent using the hardware instructions? As I understand Millicode, that is really all it is (using the hardware instructions) plus any overhead in context switching to the Millicode
Re: Millicode Instructions
> Some millicode instructions will outperform their PoOp-code counterparts because millicode has access to hardware features not available to ordinary code. For example, MVCL(E) has the ability to move data under certain conditions without loading it into cache. (You can't do that with looping MVC.) Millicode routines also have access to the MVCX instruction, which performs a variable-length MVC -- something ordinary programs cannot do without using the EXecute instruction.

MVCX sounds like it would be useful for non-millicode. Any idea why it was not externalized? Is there a corresponding CLCX?

--
Richard
Re: Millicode Instructions
Dave Gibney wrote:

begin extract
How can Millicode be faster than the equivalent using the hardware instructions? As I understand Millicode, that is really all it is (using the hardware instructions) plus any overhead in context switching to the Millicode environment.
/end extract

This is a common misunderstanding that has unfortunately been repeated many times. It is a radically misleading caricature.

Millicode makes available many facilities not available in HLASM. It does not make additional machine instructions available, but it does make available its own powerful facilities for specifying the path of control among them.

I have always felt some impatience with this view. If it were at all accurate it would make millicode, which goes back to the System/390, unimportant, even dispensable; and, while IBM is not infallible, it is deeply serious about its hardware investments.

GIYF. To begin, see (watch wrap)
http://ecc.marist.edu/conf2011/materials/SlegelSystemZ_APeekUnderTheHood_Slegel_MaristECC.pdf

John Gilmore, Ashland, MA 01721 - USA
Re: Millicode Instructions
On 4/16/2013 12:43 PM, Gibney, Dave wrote:
> I don't get to work at this level often, but I am always interested. How can Millicode be faster than the equivalent using the hardware instructions? As I understand Millicode, that is really all it is (using the hardware instructions) plus any overhead in context switching to the Millicode environment. For the MVC/MVCL option, I can imagine a macro which generates an MVC loop, or unrolls the loop into a sequence of MVCs, or generates the MVCL, depending on several criteria. I currently don't have the knowledge to determine the criteria, and I would expect the criteria to change over time.

Some millicode instructions will outperform their PoOp-code counterparts because millicode has access to hardware features not available to ordinary code. For example, MVCL(E) has the ability to move data under certain conditions without loading it into cache. (You can't do that with looping MVC.) Millicode routines also have access to the MVCX instruction, which performs a variable-length MVC -- something ordinary programs cannot do without using the EXecute instruction.

Furthermore, a millicode instruction is perceived by the architecture as a single instruction. This allows millicode to do things that cannot be simulated in ordinary code. For example, it would be impossible to write a simulation of the PLO instruction.

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/
Re: Millicode Instructions
Ed, I want to ask a question: in this day and age of processing power, is it really worth being concerned about Assembler instruction speed? Unless there is some application that is very time-sensitive; that I understand.

Regards,
Scott J Ford
Software Engineer
http://www.identityforge.com/

From: Ed Jaffe edja...@phoenixsoftware.com
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Sent: Tuesday, April 16, 2013 6:13 PM
Subject: Re: Millicode Instructions

On 4/16/2013 12:43 PM, Gibney, Dave wrote:
> I don't get to work at this level often, but I am always interested. How can Millicode be faster than the equivalent using the hardware instructions? As I understand Millicode, that is really all it is (using the hardware instructions) plus any overhead in context switching to the Millicode environment. For the MVC/MVCL option, I can imagine a macro which generates an MVC loop, or unrolls the loop into a sequence of MVCs, or generates the MVCL, depending on several criteria. I currently don't have the knowledge to determine the criteria, and I would expect the criteria to change over time.

Some millicode instructions will outperform their PoOp-code counterparts because millicode has access to hardware features not available to ordinary code. For example, MVCL(E) has the ability to move data under certain conditions without loading it into cache. (You can't do that with looping MVC.) Millicode routines also have access to the MVCX instruction, which performs a variable-length MVC -- something ordinary programs cannot do without using the EXecute instruction.

Furthermore, a millicode instruction is perceived by the architecture as a single instruction. This allows millicode to do things that cannot be simulated in ordinary code. For example, it would be impossible to write a simulation of the PLO instruction.

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/
Re: Millicode Instructions
For us, yes. We pay for most of our software based on MSU usage. My boss says that a one-MSU reduction will save us $13,000/yr. Is this huge? To us, yes. We must constantly fight the management belief that Windows is better! Cheaper! Faster! If some company could do a conversion with a one-year ROI, they would go full blast without looking at any other consideration.

On Apr 16, 2013 5:56 PM, Scott Ford scott_j_f...@yahoo.com wrote:
> Ed, I want to ask a question: in this day and age of processing power, is it really worth being concerned about Assembler instruction speed? Unless there is some application that is very time-sensitive; that I understand.
>
> Regards,
> Scott J Ford
> Software Engineer
> http://www.identityforge.com/
>
> From: Ed Jaffe edja...@phoenixsoftware.com
> To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
> Sent: Tuesday, April 16, 2013 6:13 PM
> Subject: Re: Millicode Instructions
>
> On 4/16/2013 12:43 PM, Gibney, Dave wrote:
>> I don't get to work at this level often, but I am always interested. How can Millicode be faster than the equivalent using the hardware instructions? As I understand Millicode, that is really all it is (using the hardware instructions) plus any overhead in context switching to the Millicode environment. For the MVC/MVCL option, I can imagine a macro which generates an MVC loop, or unrolls the loop into a sequence of MVCs, or generates the MVCL, depending on several criteria. I currently don't have the knowledge to determine the criteria, and I would expect the criteria to change over time.
>
> Some millicode instructions will outperform their PoOp-code counterparts because millicode has access to hardware features not available to ordinary code. For example, MVCL(E) has the ability to move data under certain conditions without loading it into cache. (You can't do that with looping MVC.) Millicode routines also have access to the MVCX instruction, which performs a variable-length MVC -- something ordinary programs cannot do without using the EXecute instruction.
>
> Furthermore, a millicode instruction is perceived by the architecture as a single instruction. This allows millicode to do things that cannot be simulated in ordinary code. For example, it would be impossible to write a simulation of the PLO instruction.
>
> --
> Edward E Jaffe
> Phoenix Software International, Inc
> 831 Parkview Drive North
> El Segundo, CA 90245
> http://www.phoenixsoftware.com/