Well, I for one don't go along with the obsession with saving a nanosecond here and there. In any case, as someone just pointed out, a compiler can do far more optimization than one can manage by hand, and compiler writers spend large amounts of time determining optimal instruction sequences for particular operations and developing algorithms to compile an optimal solution for each piece of code. Modern software systems like Microsoft .NET can even compile at run time for the current hardware architecture. So much for TRT vs. TRTE. Few active product developers have the time, I suggest, to really learn all the endless new z/Architecture instructions, continually compare them, and determine which would be optimal in this or that situation. (But maybe I'm just too lazy nowadays.)
What rarely (if ever) gets addressed here is "good programming practices" (in a wider sense than how to load a base register or whatever), something you continually encounter on HLL forums but rarely, somehow, in assembler ones. That for me means writing code that works correctly, is understandable, reflects the logic of the problem in its structure, and can easily be modified and expanded in scope. Cryptically clever code is generally best avoided, much as it seems to appeal to certain kinds of programmers. Efficiency of individual small code sections is mostly pretty irrelevant unless they sit at the center of a loop which is executed a vast number of times. Not so long ago one of the more august personalities here suggested that I should not use a system macro for its intended purpose but rather some allegedly quicker set of instructions accessing the same data via control-block pointers. However, since the code is executed once at start-up of a permanently active STC, the issue of speed was not very relevant.

Good practices in my view would also exclude enormous code sections requiring numerous base registers (even if replaced by relative branches). Our coding standards never gave rise to a need for more than one code base register, although it's all "baseless" nowadays and uses 64-bit code and even the odd ZS3 instruction. In fact we recently implemented a pre-loader with the aim of loading different code versions for modern or older machines, but have seen no pressing need to use it for that purpose yet (it has other functions as well). "Structured" programming as small logical sections is something that can be practiced in assembler too.

I am responsible for several products in use around the world in large IBM mainframe computer centers. They are all written in assembler (for various good reasons from the distant past; starting again today might change things, of course).
Although we occasionally hear a customer complain that we are using too much CPU, it is generally due to poor use of the products' facilities and not to obvious weaknesses in the code. Speed in a program often depends more on the architecture of the code than on individual instructions. Running serially through long lists or tables to find things is a common cause of CPU "hotspots"; one solution is to use a hash table. Using methods like bubble sort rather than, say, a quicksort algorithm to sort data in storage makes the programming easy but the execution much, much slower. Of course, in pure assembler these kinds of things have to be programmed by hand. (The nicer part of using an HLL - in the wider world of Java, C#, C++ etc., anyway - is having large libraries of functions available to do such things.) Misuse of system functions can cause issues too; some of our early I/O code caused problems by issuing unnecessary PGSER RELEASE requests, for example. Such things can be determined with suitable tools.

Btw, my last reply to one of your posts got caught by the reply-to issue, but I didn't feel a great need to post it again to the list. It wasn't my intention to reply to you personally.

DS

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Scott Ford
Sent: Wednesday, April 17, 2013 00:55
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Millicode Instructions

Ed,

I want to ask a question: in this day and age, with this processing power, is it really worth being concerned about Assembler instruction speed?
Unless there is some application that is very time sensitive - that I understand.

Regards,

Scott J Ford
Software Engineer
http://www.identityforge.com/

________________________________
From: Ed Jaffe <edja...@phoenixsoftware.com>
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Sent: Tuesday, April 16, 2013 6:13 PM
Subject: Re: Millicode Instructions

On 4/16/2013 12:43 PM, Gibney, Dave wrote:
> I don't get to work at this level often, but I am always interested.
> How can Millicode be faster than the equivalent using the hardware
> instructions? As I understand Millicode, that is really all it is
> (using the hardware instructions) plus any overhead in context
> switching to the Millicode "environment". For the MVC/MVCL option, I
> can imagine a macro which generates an MVC loop, or unrolls the loop
> into a sequence of MVCs, or generates the MVCL, depending on several
> criteria. I currently don't have the knowledge to determine the
> criteria, and I would expect the criteria to change over time.

Some millicode instructions will outperform their PoOp-code counterparts because millicode has access to hardware features not available to ordinary code. For example, MVCL(E) has the ability to move data under certain conditions without loading it into cache. (You can't do that with a looping MVC.) Millicode routines also have access to the MVCX instruction, which performs a variable-length MVC -- something ordinary programs cannot do without using the EXecute instruction. Furthermore, a millicode instruction is perceived by the architecture as a single instruction. This allows millicode to do things that cannot be simulated in ordinary code. For example, it would be impossible to write a simulation of the PLO instruction.

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/