Re: EXECUTE Instruction and location of its target instruction
It's not as if we walk all the way to the target, so that "farther" would mean more expensive. It's in the same cache line or it's not.

I have a macro that generates the target in-line, to ensure that HLASM knows the USING that applies to the target, accepting the fact that I sometimes need to branch over it. Like:

MOVE     MVC   A(0),B
         EX    R1,MOVE

Rumor was that walking over the target before executing it was so common that the CPU exploited the fact that the instruction was still in the pipeline, or that this at least outweighed the cost of the branch over it. But I have not been able to show any difference other than in code length.

On some models it was not good to have the target in the literal pool, and I did not feel like doing an extra code section for them.

Rather than dreaming of huge savings, measure before you make changes. And understand that things will be different with the next model. Most likely, for 90% of the code you will never gain back the time you spend on this.

Rob

On 23 November 2016 at 10:15, Philippe Cloarec wrote:
> Hi,
>
> My understanding is that we should keep the EXECUTE instruction and its
> target instruction as close together as possible, because EXECUTE is
> fairly greedy in terms of CPU use: it needs dozens of cycles to complete.
>
> Reviewing old Assembler programs, I am surprised to see quite often that
> the target instructions for ALL the EXECUTE instructions coded in a
> program are grouped together, mostly at the end.
>
> Sometimes the offset between the EXECUTE and its target instruction is
> quite BIG, as below:
>
> 001214 4490 B942            01942  1142          EX    R9,LIBCLE2
> ..
> ..
> 001942 D200 BC76 BD3F 01C76 01D3F  1607 LIBCLE2  MVC   SCLE2(0),ZONLIB
>
> I have read many articles, and often read that EX should be close to its
> target instruction, but I found no recommendation in terms of the offset
> between the two. From my humble understanding, the farther the target
> instruction is from the EX, the more costly its execution will be - right?
> So, knowing that dozens of cycles are "normally" required to complete one
> EX instruction, changing the program to minimize the offset between EX and
> its target instruction CAN GREATLY reduce CPU use in comparison with the
> previous code - right?
>
> Thx in advance for any input you may have.
>
> Regards,
> Philippe
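The in-line pattern Rob describes could be sketched in two variants, as below. The labels, the fields A and B and the register equate are placeholders invented for illustration, and an active USING covering the code is assumed. Which variant is faster, if either, is model dependent and would have to be measured.

```hlasm
R1       EQU   1
* Variant 1: fall through the target, then execute it.
* Only usable when moving one byte first is harmless, because
* the MVC runs once with its assembled length code of 0.
MOVE1    MVC   A(0),B             Falls through: moves 1 byte
         EX    R1,MOVE1           R1 = length-1, moves length bytes
*
* Variant 2: branch over the target so it only runs under EX.
         B     GO2                Skip the target
MOVE2    MVC   A(0),B             Reached only through EX
GO2      EX    R1,MOVE2           R1 = length-1, moves length bytes
*
A        DS    CL256              Hypothetical receiving field
B        DS    CL256              Hypothetical sending field
```

In both variants the target stays within a few bytes of the EX, and the assembler resolves A and B under the same USING as the surrounding code.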
Re: EXECUTE Instruction and location of its target instruction
Hi Rob,

Ref: "It's in the same cache line or it's not."

That was exactly my point. I was confused by seeing various coding approaches that all aim to reduce CPU use as much as possible.

Later,
Philippe
Re: EXECUTE Instruction and location of its target instruction
Rob,

>> Most likely for 90% of the code you will never gain back the time you
>> spend on this.

I more than agree (maybe even 99%), but I will insist that making code "baseless" eliminates every way I can think of (with 370 instructions) to modify code on the fly, and thus improves performance. Whether you do it with IEABRC, IEABRCX or MAKEREL (doing it by hand is error-prone, but ...) is a matter of preference.

Martin
Re: EXECUTE Instruction and location of its target instruction
Martin,

Thx for the input. Unfortunately, for the project I am currently working on, converting the old asm programs to baseless processing is NOT an option; there is no time and no budget for it :)

Later,
Philippe
Re: EXECUTE Instruction and location of its target instruction
As long as the target of the execute is not in a D-bank cache line, you shouldn't see a big "hit" in processing cycles. If it is, the line has to be flushed out, refetched into I-bank, pulled into the pipe, processed, flushed back out, and then brought back into D-bank (if still needed).

This is based on my memory of the process from when we first went to 256-byte cache lines with I/D bank caching.
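To make the I/D conflict concrete, here is a minimal sketch of a layout that invites it and one that sidesteps it. All labels are hypothetical; OUT and IN are assumed to be defined elsewhere, away from the code, and whether a given target really shares a cache line with updated data depends on the model's line size and on where the module is loaded.

```hlasm
R1       EQU   1
* Layout to avoid: the EX target sits beside a field that the
* program stores into, so one cache line is wanted both as
* instructions and as updated data.
COUNTER  DC    F'0'               Stored into at run time
MOVEBAD  MVC   OUT(0),IN          EX target beside updated data
*
* Layout that sidesteps it: the target stays with the code and
* the updated fields live in a separate work area.
         EX    R1,MOVEOK          R1 = length-1
         B     NEXT               Skip the target
MOVEOK   MVC   OUT(0),IN          Target stays among instructions
NEXT     DS    0H
```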
Re: EXECUTE Instruction and location of its target instruction
Hi Steve, Thx for your input, yes this is my understanding of the process. Philippe
Re: EXECUTE Instruction and location of its target instruction
BTW -- I understand your situation. I have ALC code that I have to bring forward from a 1980ish coding style to be RENT and G3-chipset compliant. And while it is high priority to get done, I keep getting hit with higher-priority work, so I get limited time to actually make things happen.

I make heavy use of IBM macros to generate relative branches from old-style ones, so I don't have to physically touch the instructions. That gives me time to solve I/D bank collisions, among other RENT problems.
Re: EXECUTE Instruction and location of its target instruction
Philippe,

Most of your questions have already been answered by others. The one thing not yet addressed is this one:

> Reviewing old Assembler programs, I guess I am surprised to see enough often
> that all target instructions for ALL the Execute instructions coded in the
> programs grouped mostly at the end...

The reason is that (much) older hardware had no separate I-cache and D-cache, so the performance penalty for the target not being in the same cache line did not exist.

I hope this helps -- not in solving your problem, but in understanding how things have come about.

Kind regards,
Abe Kornelis
Re: EXECUTE Instruction and location of its target instruction
Hi Steve,

Thx for the additional info about your similar experience.

Philippe
Re: EXECUTE Instruction and location of its target instruction
Hi Abe, Very good point, about old processors. Thx much Philippe
Rif: Re: EXECUTE Instruction and location of its target instruction
I think it is appropriate to use an EXRL (Execute Relative Long) instead of an EX. I also think it is appropriate to place the subject (target) of the execute instruction close to the EX, preferably right after an unconditional branch.

         .
         XC    OUT,OUT
         .
         LA    1,L'IN-1
         .
         EXRL  1,MVC_INSTR
         LA    1,1(1)
         ST    1,OUT_LGTH
         .
         J     EX_DONE
MVC_INSTR MVC  OUT(1),IN
EX_DONE  DS    0H
* data bank
OUT_LGTH DS    F
         .
OUT      DS    CL256
IN       DC    CL8'short'

Aldo Crosio
CSE Consorzio Servizi Bancari
Re: Rif: Re: EXECUTE Instruction and location of its target instruction
Hi Aldo,

Thx much for the input. Unfortunately I cannot implement baseless processing in this project, so I cannot use the EXRL instruction. The point is to keep the target of the execute in an I-bank cache line.

Regards,
Philippe
Re: Rif: Re: EXECUTE Instruction and location of its target instruction
On 2016-11-23 07:19, aldo.cro...@csebo.it wrote:
> I think it is appropriate to use a EXRL (execute remote) intest a EX.
> I also think that it is appropriate to place the subject of education
> execute close to the EX, preferably after a statement of unconditional
> branch.
>
Is it recommended for legibility/maintainability that the subject appear adjacent to the EX rather than after a nearby unrelated branch?

What effect does an unconditional branch have on branch prediction/pipelining?
http://www.wrenvironmental.com/commercial/services/pipelining/

Is LOCTR a help? I can imagine the frustration of a programmer trying to correlate a dump with a listing where the author has used LOCTR heavily, and wishing that HLASM had an option to generate SYSPRINT in address order rather than in source order.

Does HLASM have an instruction to cause cache line alignment? Such an instruction would need to be model-sensitive, perhaps governed by OPTABLE.

-- gil
Re: Rif: Re: EXECUTE Instruction and location of its target instruction
Hi Paul,

Thx much for the input.

Ref: "Is LOCTR a help?"

Actually, Martin T kindly sent me some info about a possible use of LOCTR in such a case; it can be found at http://www.pi-sysprog.de/free/makerel.html

Philippe
Re: Rif: Re: EXECUTE Instruction and location of its target instruction
The closest instruction HLASM has to what Gil asked for is CNOP, which was updated a couple of years ago by APAR PI17455, but you do need to know what your cache lines are. See
http://www.ibm.com/support/docview.wss?uid=isg1PI17455
and
http://www.ibm.com/support/docview.wss?uid=swg21687009

Where the OPTABLE value is not one of DOS, 370 or XA, the no-operation instructions that CNOP generates are BRC or BRCL instructions.

Sharuff
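As a minimal sketch of how CNOP could be used here, the fragment below pads up to a boundary after an unconditional branch, so that the no-operation filler is never executed and the EX target starts on the boundary. The labels and fields are hypothetical. The doubleword form CNOP 0,8 is the long-standing one; using a larger boundary, such as a cache line size, is only an assumption about what the APAR permits at your maintenance level and should be checked against the HLASM Language Reference.

```hlasm
R1       EQU   1
         EX    R1,TARGET          R1 = length-1
         B     RESUME             Skip the padding and the target
         CNOP  0,8                No-op filler up to a doubleword
TARGET   MVC   OUT(0),IN          Target starts on the boundary
RESUME   DS    0H
*
OUT      DS    CL256              Hypothetical receiving field
IN       DS    CL256              Hypothetical sending field
```

A boundary in the listing is only a boundary in storage if the section itself is loaded on at least that alignment.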
Re: Rif: Re: EXECUTE Instruction and location of its target instruction
You can make a macro that will do this. At Amdahl we had one at one time. Where I currently work, we have a macro called ALIGN. It was developed long before the G3 chipset (where 256-byte cache lines came about, IIRC).

If you make use of &SYSPARM, you could control how your expansion works by passing in the cache line info in some fashion. For now, I use it to make static storage start on a new cache line, by making the CSECT start on a page boundary.

Note that HLASM can't align to something unless you have a known starting point. So by making your CSECTs page aligned, you know what will be 32-byte, 256-byte, etc. aligned. Should cache lines not be evenly placed within pages, there goes that methodology.

> On Nov 23, 2016, at 7:50 PM, Paul Gilmartin
> <0014e0e4a59b-dmarc-requ...@listserv.uga.edu> wrote:
>
> Does HLASM have an instruction to cause cache line alignment? Such an
> instruction would need to be model-sensitive, perhaps governed by OPTABLE.
>
> -- gil
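A macro along those lines might look like the sketch below. This is not the ALIGN macro mentioned above, whose source is not shown in the thread; the operand names &SECT and &LINE and the section STATIC are invented for illustration. It rounds the location counter up to the next multiple of &LINE, measured from the start of the named section, which is why the section itself has to be loaded on a boundary that is a multiple of the line size.

```hlasm
         MACRO
         ALIGN &SECT,&LINE=256
.* Advance the location counter to the next &LINE-byte multiple,
.* counted from the start of section &SECT. This is a true
.* cache-line boundary only if &SECT is loaded on a boundary
.* that is a multiple of &LINE (a page boundary, for instance).
         ORG   &SECT+((*-&SECT+&LINE-1)/&LINE)*&LINE
         MEND
*
STATIC   CSECT
FLAGS    DS    XL3                Some earlier static data
         ALIGN STATIC             Default: next 256-byte multiple
TABLE    DS    CL256              Starts at STATIC+256
```

The line size is a keyword operand so that it could be supplied from &SYSPARM, as suggested above, instead of being fixed in the source. The ORG leaves a gap rather than generating filler, so this form suits static storage more than instruction streams.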
Re: Rif: Re: EXECUTE Instruction and location of its target instruction
Hi Steve,

Thx for the input. Right, I know about the idea of having a macro to align to a 256-byte boundary.

Philippe
Rif: Re: Rif: Re: EXECUTE Instruction and location of its target instruction
You can also use EXRL in code that is not baseless, without problems.

Aldo Crosio
Rif: Re: Rif: Re: EXECUTE Instruction and location of its target instruction
> What effect does an unconditional branch have on branch prediction/pipelining?
> Is LOCTR a help? I can imagine the frustration of a programmer trying to
> Does HLASM have an instruction to cause cache line alignment? Such an
> instruction would need to be model-sensitive, perhaps governed by OPTABLE.

The target is placed near the EXRL to improve the readability of the program. The unconditional branch can be one that is already present nearby (before or after the EXRL).

When an instruction is assembled, it is always aligned on a halfword boundary:

Active Usings: hash,R4 cnv$$,R11 CEECAA,R12 CEEDSA,R13
Loc    Object Code     Addr1 Addr2  Stmt  Source Statement
00234                                715           ds    0f
00234  A7                            716           dc    cl1'x'
00235  00
00236  D200 1000 2000                717  exobj    mvc   0(1,1),0(2)
0023C  81                            718           dc    cl1'a'
0023D  00
0023E  D200 1000 2000                719  exobj1   mvc   0(1,1),0(2)

The target of an EXRL can be placed anywhere in the "not writable" code, even in a CSECT/RSECT other than the primary one. Using LOCTR does not affect relative addressing. When the target is in another CSECT, the relative address is calculated by the linkage editor:

Active Usings: None
Loc    Object Code     Addr1 Addr2  Stmt  Source Statement
00                           0032      1  PREXRL   RSECT
00     B240 00E0                       2           BAKR  14,0
04     C600 0013             002A      3           EXRL  0,A
0A     C600 0013             0030      4           EXRL  0,B
10     C600 FFF8                       5           EXRL  0,C
16     C600 FFF7             0004      6           EXRL  0,D
1C     C010 FFF2                       7           LARL  1,PREXRL
22                                     8           DC    X''
26     17FF                            9           XR    15,15
28     0101                           10           PR
2A     1711                           11  A        XR    1,1
2C     404040                         12           DC    CL3' '
2F     00
30     1711                           13  B        XR    1,1
00                           0006     14  PRXX     RSECT
00     1711                           15  C        XR    1,1
02     40                             16           DC    CL1' '
03     00
04     1711                           17  D        XR    1,1
                                      18           END

The example above confirms what was said earlier: its run ends with abend 0C1, and R1 points to the start of the module. In memory, starting at 7FC0, the instructions are:

B24000E0  bakr
C613      exrl
C613      exrl
C614      exrl
C613      exrl
C010FfF2  larl
          dc x''
17FF      xr
0101      pr

Analysis of the executable: the dump, taken from the load library, shows that the addresses are already resolved.

BROWSE    SAC058.LINK.LIB(PREXRL)    Line 00 Col 481 560
Command ===>                          Scroll ===> CSR
* Top of Data **
F.F.F.F.
C1 C1 C1 C1
63 63 64 63

The same as in the abend.

Aldo Crosio
Re: Rif: Re: Rif: Re: EXECUTE Instruction and location of its target instruction
Aldo, Yes I am aware of this. Thx Philippe
Re: Rif: Re: Rif: Re: EXECUTE Instruction and location of its target instruction
Your listing confused me. At the latest HLASM maintenance level (UI42852) I get:

Loc    Object Code     Addr1 Addr2  Stmt  Source Statement
                                       1  * from ASSEMBLER-LIST 24/11/2016
                             0032      2  PREXRL   RSECT
       B240 00E0                       3           BAKR  14,0
0004   C600 0013             002A      4           EXRL  0,A
000A   C600 0013             0030      5           EXRL  0,B
0010   C600 0014             0038      6           EXRL  0,C
0016   C600 0013             003C      7           EXRL  0,D
001C   C010 FFF2                       8           LARL  1,PREXRL
0022                                   9           DC    X''
0026   17FF                           10           XR    15,15
0028   0101                           11           PR
002A   1711                           12  A        XR    1,1
002C   404040                         13           DC    CL3' '
002F   00
0030   1711                           15  B        XR    1,1
0038                   0038  0006     16  PRXX     RSECT
0038   1711                           17  C        XR    1,1
003A   40                             18           DC    CL1' '
003B   00
003C   1711                           20  D        XR    1,1
                                      21           END

with the RLD xref correctly showing:

Relocation Dictionary
Pos.Id  Rel.Id  Address  Type  Length  Action
0004    0009    0012     RI    4       +
0004    0009    0018     RI    4       +

The data in ADDR2 is correct (and useful), as are the instructions printed.

Sharuff
Rif: Re: Rif: Re: Rif: Re: EXECUTE Instruction and location of its target instruction
The difference is the option used:

THREAD / NOTHREAD
Default: THREAD

THREAD
  Specifies that the assembler not reset the location counter to zero at the beginning of each CSECT.
NOTHREAD
  Specifies that the assembler reset the location counter to zero at the beginning of each CSECT, except for the first CSECT when it is initiated by the START instruction having a nonzero operand.

I think that NOTHREAD makes it easier to find an area within a CSECT while reading a dump.

Option NOTHREAD:

R-Loc  Object Code     Addr1 Addr2  Stmt  Source Statement
                             002A      1  PREXRL   RSECT
       B240 00E0                       2           BAKR  14,0
0004   C600 FFFE                       3           EXRL  0,A
000A   C600 FFFE             0006      4           EXRL  0,B
0010   C600 FFFC             0008      5           EXRL  0,C
0016   C600 FFFB             000C      6           EXRL  0,D
                             000E      7  PRXX     RSECT
       1711                            8  A        XR    1,1
0002   404040                          9           DC    CL3' '
0005   00
0006   1711                           10  B        XR    1,1
0008   1711                           11  C        XR    1,1
001C                         002A     12  PREXRL   RSECT
001C   C010 FFF2                      13           LARL  1,PREXRL
0022                                  14           DC    X''
0026   17FF                           15           XR    15,15
0028   0101                           16           PR
000A                         000E     17  PRXX     RSECT
000A   40                             18           DC    CL1' '
000B   00
000C   1711                           19  D        XR    1,1
                                      20           END

Linkage editor output:

SECTION  CLASS                          --- SOURCE ---
 OFFSET  OFFSET  NAME    TYPE   LENGTH  DDNAME  SEQ  MEMBER
              0  PREXRL  CSECT      2A  OBJLIB   01  PREXRL
             30  PRXX    CSECT       E  OBJLIB   01  PREXRL

Option THREAD:

Loc    Object Code     Addr1 Addr2  Stmt  Source Statement
                             002A      1  PREXRL   RSECT
       B240 00E0                       2           BAKR  14,0
0004   C600 0016             0030      3           EXRL  0,A
000A   C600 0016             0036      4           EXRL  0,B
0010   C600 0014             0038      5           EXRL  0,C
0016   C600 0013             003C      6           EXRL  0,D
0030                   0030  000E      7  PRXX     RSECT
0030   1711                            8  A        XR    1,1
0032   404040                          9           DC    CL3' '
0035   00
0036   1711                           10  B        XR    1,1
0038   1711                           11  C        XR    1,1
001C                         002A     12  PREXRL   RSECT
001C   C010 FFF2                      13           LARL  1,PREXRL
0022                                  14           DC    X''
0026   17FF                           15           XR    15,15
0028   0101                           16           PR
003A                   0030  000E     17  PRXX     RSECT
003A   40                             18           DC    CL1' '
003B   00
003C   1711                           19  D        XR    1,1
                                      20           END

Linkage editor output:

SECTION  CLASS                          --- SOURCE ---
 OFFSET  OFFSET  NAME    TYPE   LENGTH  DDNAME  SEQ  MEMBER
              0  PREXRL  CSECT      2A  OBJLIB   01  PREXRL
             30  PRXX    CSECT       E  OBJLIB   01  PREXRL

Aldo Crosio
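The halfword offsets shown in the THREAD and NOTHREAD listings can be checked by hand. For EXRL and LARL the immediate field is a signed count of halfwords, measured from the address of the instruction itself. The worked values below use only addresses that appear in the listings; the notation is mine, with all addresses in hexadecimal.

```latex
\[
  \mathrm{RI}_2 \;=\; \frac{\text{target address} - \text{address of the EXRL or LARL}}{2}
\]

\begin{align*}
\text{THREAD, EXRL 0,A at } \mathtt{0004}:   \quad & (\mathtt{30}_{16} - \mathtt{04}_{16}) / 2 = \mathtt{16}_{16} \\
\text{THREAD, EXRL 0,D at } \mathtt{0016}:   \quad & (\mathtt{3C}_{16} - \mathtt{16}_{16}) / 2 = \mathtt{13}_{16} \\
\text{NOTHREAD, EXRL 0,A at } \mathtt{0004}: \quad & (\mathtt{00}_{16} - \mathtt{04}_{16}) / 2 = -2 \;\Rightarrow\; \mathtt{FFFE}_{16} \\
\text{LARL 1,PREXRL at } \mathtt{001C}:      \quad & (\mathtt{00}_{16} - \mathtt{1C}_{16}) / 2 = -14 \;\Rightarrow\; \mathtt{FFF2}_{16}
\end{align*}
```

With NOTHREAD the location counter restarts at zero for PRXX, so the assembled offset for A is taken against address 0 of that section. The linkage editor map places PRXX at offset 30, and (30 + 0 - 04) / 2 = 16 in hexadecimal, the same value the THREAD listing shows. That appears consistent with the statement earlier in the thread that the relative address of a target in another section is calculated by the linkage editor.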