Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
In 4fbd69b6.5080...@t-online.de, on 05/24/2012 at 12:50 AM, Bernd Oppolzer bernd.oppol...@t-online.de said:

> This limit is too high, in our opinion, because some tests showed,

On what processor? I assume that IBM wants to optimize for z196 and later.

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
The test runs were on a z196 with the current z/OS release. I didn't run the tests myself; I was only told about the results, but the co-worker is normally very reliable.

Regards
Bernd

On 24.05.2012 12:54, Shmuel Metz (Seymour J.) wrote:
> In 4fbd69b6.5080...@t-online.de, on 05/24/2012 at 12:50 AM, Bernd Oppolzer bernd.oppol...@t-online.de said:
>> This limit is too high, in our opinion, because some tests showed,
>
> On what processor? I assume that IBM wants to optimize for z196 and later.
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Update to this "CALL PLIMOVE generates a loop of MVCs" topic:

We did some further research and found out that for CALL PLIMOVE with a length known at compile time, even EP PL/1 V3.9 generates a code sequence that uses MVCL if the length is greater than or equal to 16384. For lengths below 16384, PL/1 generates a loop of MVCs.

This limit is too high, in our opinion, because some tests showed that MVCL is faster than MVCs starting from a length of ca. 768 bytes, that is, 3 MVCs. A length of 16384 needs 64 MVCs (and maybe loop control instructions). We will ask IBM to change this.

Furthermore, I think that MVCL should be generated not only for CALL PLIMOVE, but also for normal assignments, if the length is in the same range. But we have done no research on this so far; our focus is on CALL PLIMOVE at the moment.

Kind regards
Bernd

On 17.05.2012 15:42, Robert AH Prins wrote:
>> But here we have a simple instruction of the HLL (PLIMOVE) which I expect to be implemented using the best instructions the machine provides. If this turns out not to be the case, this is IMHO simply a bug, not only a flaw of the optimizer. The programmer already did some kind of optimization him- or herself, when he or she decided to use PLIMOVE. He or she may well expect that the compiler generates the best available machine instruction for this HLL instruction.
>
> You hit the nail right on the head! But I do remember that there was an APAR that explains why the MVCL was removed again. I can't point you to it as the link to the PL/I APARs has gone 404.
>
> ...
> Robert
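The arithmetic behind the 768-byte and 16384-byte figures in this message can be sketched as follows. This is only an illustration in Python, not anything the compiler does; it relies on the architectural fact that a single MVC moves at most 256 bytes. The function names and the `threshold` parameter are hypothetical, and the 768-byte crossover is the figure reported from the z196 tests in this thread, not a general rule.

```python
import math

MVC_MAX_BYTES = 256  # architectural limit: one MVC moves 1..256 bytes


def mvc_count(length: int) -> int:
    """Number of MVC instructions needed to move `length` bytes."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return math.ceil(length / MVC_MAX_BYTES)


def choose_move(length: int, threshold: int) -> str:
    """Hypothetical selection rule: MVCL at or above `threshold`, else MVCs."""
    return "MVCL" if length >= threshold else "MVC"


print(mvc_count(768))             # 3
print(mvc_count(16384))           # 64
print(choose_move(8000, 16384))   # MVC  (the behaviour described for V3.9)
print(choose_move(8000, 768))     # MVCL (with the crossover suggested by the tests)
```

With the 16384-byte limit, the 8000-byte PLIMOVE mentioned elsewhere in the thread falls on the MVC side and needs 32 MVCs; with a 768-byte limit it would be a single MVCL.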
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 2012-05-17 11:42, Robert AH Prins wrote:
> On 2012-05-17 11:14, Bernd Oppolzer wrote:
>> The programmer already did some kind of optimization him- or herself, when he or she decided to use PLIMOVE. He or she may well expect that the compiler generates the best available machine instruction for this HLL instruction.
>
> You hit the nail right on the head! But I do remember that there was an APAR that explains why the MVCL was removed again. I can't point you to it as the link to the PL/I APARs has gone 404.

It's back and I have to correct myself: I've only managed to find an APAR saying that the generation of MVCLEs was removed:

http://www-01.ibm.com/support/docview.wss?rs=619context=SSY2V3q1=%2b5655H3100+%2br370uid=swg1PK79325loc=en_UScs=utf-8cc=uslang=all

Robert
-- 
Robert AH Prins
robert(a)prino(d)org
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On Fri, May 18, 2012 at 3:54 AM, Robert AH Prins robert.ah.pr...@gmail.com wrote:
> On 2012-05-17 11:42, Robert AH Prins wrote:
>> But I do remember that there was an APAR that explains why the MVCL was removed again. I can't point you to it as the link to the PL/I APARs has gone 404.
>
> It's back and I have to correct myself: I've only managed to find an APAR saying that the generation of MVCLEs was removed:
>
> http://www-01.ibm.com/support/docview.wss?rs=619context=SSY2V3q1=%2b5655H3100+%2br370uid=swg1PK79325loc=en_UScs=utf-8cc=uslang=all
>
> Robert
> -- 
> Robert AH Prins
> robert(a)prino(d)org

From the APAR, dated 2009-02-02:

Problem summary

* USERS AFFECTED: Enterprise PL/I users with code that has assignments to NONVARYING CHAR strings where the source and/or the target has a length not known at compile time
* PROBLEM DESCRIPTION: The 3.6 and later releases of Enterprise PL/I generated a MVCLE instruction to perform such assignments. Unfortunately, while this led to a much shorter set of instructions, it also led to significantly worse performance.
* RECOMMENDATION:

The compiler has been changed so that it no longer generates MVCLE's

-- 
Mike A Schwab, Springfield IL USA
Where do Forest Rangers go to get away from it all?
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 17/05/2012 2:06 AM, Tom Marchant wrote:
> On Tue, 15 May 2012 20:07:52 +, Robert Prins wrote:
>> maybe a 16-byte three-instruction sequence like
>>
>>     003FC0 E310 DF10 0158 003120 | LY  r1,a1:d7952:l4(,r13,7952)
>>     003FC6 E300 1047 0015 003120 | LGH r0,_shadow20(,r1,71)
>>     003FCC 4000 E064      003120 | STH r0,_shadow20(,r14,100)
>>
>> is really faster than the simple 6-byte one-instruction sequence
>>
>>     0026D4 D2 01 7 064 6 047  MVC REPT_LINE.DATE.MONTH(2),REPT_LIST.DATE.MONTH
>
> Not likely. Address Generation Interlock (AGI) will cause the second instruction to stall until the address is available in R1.

Tom, I'm not sure if that's still true on z10/z196 processors, which implement AGI bypass.

http://www.ibmsystemsmag.com/CMSTemplates/IBMSystemsMag/Print.aspx?path=/mainframe/administrator/performance/cpu_pipeline

Apparently the worst-case scenario is a load in 1 cycle. Load address has been mitigated. In addition, instruction cracking will, under some circumstances, cause a z196 processor to execute a load and a store when an MVC instruction is executed.
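The "16-byte" and "6-byte" sizes of the two sequences compared in this message can be checked from the opcode bytes alone. The following Python sketch is illustrative only; it assumes the z/Architecture rule that the two leftmost bits of the first opcode byte give the instruction length (00 = 2 bytes, 01 or 10 = 4 bytes, 11 = 6 bytes). The function and variable names are hypothetical.

```python
def instruction_length(opcode_byte: int) -> int:
    """Length in bytes of a z/Architecture instruction, from bits 0-1 of its first byte."""
    top_bits = (opcode_byte >> 6) & 0b11
    if top_bits == 0b00:
        return 2
    if top_bits == 0b11:
        return 6
    return 4  # 0b01 and 0b10 both denote 4-byte instructions


# First opcode bytes taken from the listings quoted in this message.
new_sequence = [0xE3, 0xE3, 0x40]  # LY, LGH, STH (Enterprise PL/I V3R9)
old_sequence = [0xD2]              # MVC          (OS PL/I V2.3.0)

print(sum(instruction_length(b) for b in new_sequence))  # 16
print(sum(instruction_length(b) for b in old_sequence))  # 6
```

Code size says nothing about elapsed time on a pipelined processor, which is the point being argued in the thread; the sketch only confirms the byte counts.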
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
I would like to add: with the previous compiler, CALL PLIMOVE enabled us to force the generation of MVCL. Using, for example,

    CALL PLIMOVE (ADDR (target), ADDR (source), length);

the compiler generated MVCL, but when coding

    target = source;

(if applicable), or BY NAME assignments, the compiler generated MVCs etc.

Now, with V3.9, the compiler generates the same code in the two cases, that is, MVCs or MVC loops, so we have no possibility to force the generation of MVCL.

AFAIK, my co-workers haven't played with the ARCH options so far. TUNE is TUNE(2), again AFAIK.

I have other projects at the moment, so I have not had much time so far to investigate this. But remember: the problem showed up in a Strobe report, so it seems to be significant.

But: if PLIMOVE does no better than a simple assignment, using PLIMOVE seems to make no sense to me.

In a certain way, this problem is somewhat different from the first problem in this thread. Robert complained about the optimizer doing a bad job, that is: some instructions are generated that are useless, and others are questionable. But here we have a simple instruction of the HLL (PLIMOVE) which I expect to be implemented using the best instructions the machine provides. If this turns out not to be the case, this is IMHO simply a bug, not only a flaw of the optimizer. The programmer already did some kind of optimization him- or herself, when he or she decided to use PLIMOVE. He or she may well expect that the compiler generates the best available machine instruction for this HLL instruction.

Kind regards
Bernd

On Wednesday, 16 May 2012 21:41, Bernd Oppolzer wrote:
> First, I would like to thank you for starting this thread. I posted it to the performance people of my customer, and they told me that they just found a similar problem with EP PL/1 3.9, that is: the PLIMOVE calls don't generate MVCLs any more, as in previous releases, but series of MVCs and loops. Even when the length of PLIMOVE is, for example, 8000 bytes.
>
> They discovered it because one of the PLIMOVE locations showed up in a Strobe report. I asked them to test, using an assembler program, whether the MVC loop is faster, but they told me that even with lengths around 500 or 600, the MVCL solution is faster - this is on a z196. I still have to confirm this.
>
> If this turns out to be true, this sounds like a bug, and we will try to convince IBM to go back to the previous solution. If we compile our modules during normal service using EP PL/1 3.9, our system will get slower and slower, because PLIMOVE is widely used. This is not acceptable.
>
> Because the PLIMOVEs are generated by a site-specific macro called PLICOPY, I already thought about calling a short assembler routine (with minimal linkage conventions) that does the transfer using MVCL instead of CALL PLIMOVE. The applications need not be changed, because the PLICOPY syntax stays the same. Maybe this could still be faster than doing the MVC loop.
>
> Kind regards
> Bernd
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 2012-05-17 11:14, Bernd Oppolzer wrote:
> I would like to add: with the previous compiler, CALL PLIMOVE enabled us to force the generation of MVCL. Using, for example,
>
>     CALL PLIMOVE (ADDR (target), ADDR (source), length);
>
> the compiler generated MVCL, but when coding
>
>     target = source;
>
> (if applicable), or BY NAME assignments, the compiler generated MVCs etc.
>
> Now, with V3.9, the compiler generates the same code in the two cases, that is, MVCs or MVC loops, so we have no possibility to force the generation of MVCL.
>
> AFAIK, my co-workers haven't played with the ARCH options so far. TUNE is TUNE(2), again AFAIK.

The TUNE option has been removed from V4.1 anyway.

> I have other projects at the moment, so I have not had much time so far to investigate this. But remember: the problem showed up in a Strobe report, so it seems to be significant.
>
> But: if PLIMOVE does no better than a simple assignment, using PLIMOVE seems to make no sense to me.
>
> In a certain way, this problem is somewhat different from the first problem in this thread. Robert complained about the optimizer doing a bad job, that is: some instructions are generated that are useless, and others are questionable. But here we have a simple instruction of the HLL (PLIMOVE) which I expect to be implemented using the best instructions the machine provides. If this turns out not to be the case, this is IMHO simply a bug, not only a flaw of the optimizer. The programmer already did some kind of optimization him- or herself, when he or she decided to use PLIMOVE. He or she may well expect that the compiler generates the best available machine instruction for this HLL instruction.

You hit the nail right on the head! But I do remember that there was an APAR that explains why the MVCL was removed again. I can't point you to it as the link to the PL/I APARs has gone 404.

Finally, a note to those following this thread: due to the closure of the gateway between 'bit.listserv.ibm-main' and the list, it is now available in two diverging versions, one here on the list, the other one on news://comp.lang.pl1. Very regrettable.

Robert
-- 
Robert AH Prins
robert(a)prino(d)org
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Robert Prins wrote:
> Can anyone skilled in the art tell me why a compiler that probably dates back to the late 1970s or early 1980s generates the following short and sweet code for a PL/I BY NAME assignment, while the not completely new (but still fairly recent) version of Enterprise PL/I (V3R9) generates the very, very, very long-winded code below it?

I'm not skilled in this art, but is your Enterprise PL/I (V3R9) also using Language Environment or not?

Then again, I always thought that the fastest instructions are the ones that are never executed... Those instructions don't need to be optimized... :-)

Groete / Greetings
Elardus Engelbrecht
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Robert,

I'm no expert but I have read that newer hardware models (z10 and above) are essentially RISC processors that run complex instructions in millicode. In the case of an MVC instruction it would have to do that in a loop, which would require branching, the enemy of pipelined execution units. It's also possible to run simple instructions in parallel. It's plausible that an MVC instruction can be executed more efficiently as a sequence of LG/STG instructions.

The OOO decode units do this for you with instruction cracking on a z196; it seems that on a z10 the optimizer is doing the same thing.

See this document - page 21
http://www-01.ibm.com/software/htp/tpf/tpfug/tgf11/How_do_you_do_when_youre_a_z196_CPU.pdf

Optimizers create arcane code. It's almost impossible to verify without understanding the secret sauce. A lot of the code the optimizers spit out is intractable, and it's almost a paradox that a longer code path produces faster code. If you don't like it you can always compile at a different ARCH() level and ask IBM.

On 16/05/2012 4:07 AM, Robert Prins wrote:
> Can anyone skilled in the art tell me why a compiler that probably dates back to the late 1970s or early 1980s generates the following short and sweet code for a PL/I BY NAME assignment, while the not completely new (but still fairly recent) version of Enterprise PL/I (V3R9) generates the very, very, very long-winded code below it?
>
> Or is this (V3R9) code (that predates the OOO z196 architecture) really faster?
OS PL/I V2.3.0 - OPT(2)

     343  1  2    REPT_LINE = REPT_LIST, BY NAME;

    * STATEMENT NUMBER 343
    002664  58 70 8 268        L     7,REPT_WORK.LINE_PTR
    002668  58 60 8 030        L     6,REPT_WORK.REPT_PTR
    00266C  58 F0 3 600        L     15,1536(0,3)
    002670  D2 03 7 003 F B54  MVC   REPT_LINE.TR(4),2900(15)
    002676  DE 03 7 003 6 00C  ED    REPT_LINE.TR(4),REPT_LIST.TR
    00267C  D2 03 7 00A F B54  MVC   REPT_LINE.RE(4),2900(15)
    002682  DE 03 7 00A 6 00E  ED    REPT_LINE.RI(4),REPT_LIST.RI
    002688  D2 02 7 011 6 010  MVC   REPT_LINE.DA(3),REPT_LIST.DA
    00268E  58 E0 3 608        L     14,1544(0,3)
    002692  D2 06 4 158 E 5D4  MVC   344(7,4),1492(14)
    002698  DE 06 4 158 6 014  ED    344(7,4),REPT_LIST.K+1
    00269E  D2 05 7 017 4 159  MVC   REPT_LINE.K(6),345(4)
    0026A4  D2 06 4 158 E 5D4  MVC   344(7,4),1492(14)
    0026AA  DE 06 4 158 6 01B  ED    344(7,4),REPT_LIST.V
    0026B0  D2 04 7 028 4 15A  MVC   REPT_LINE.V(5),346(4)
    0026B6  D2 03 7 030 6 026  MVC   REPT_LINE.NA(4),REPT_LIST.NA
    0026BC  D2 03 7 036 6 02A  MVC   REPT_LINE.TY(4),REPT_LIST.TY
    0026C2  D2 03 7 03D 6 02E  MVC   REPT_LINE.CO(4),REPT_LIST.CO
    0026C8  D2 00 7 04B 6 036  MVC   REPT_LINE.SP(1),REPT_LIST.SP
    0026CE  D2 03 7 05F 6 043  MVC   REPT_LINE.DATE.YEAR(4),REPT_LIST.DATE.YEAR
    0026D4  D2 01 7 064 6 047  MVC   REPT_LINE.DATE.MONTH(2),REPT_LIST.DATE.MONTH
    0026DA  D2 01 7 067 6 049  MVC   REPT_LINE.DATE.DAY(2),REPT_LIST.DATE.DAY

Enterprise PL/I for z/OS V3.R9.M0 (Built:20100923) - OPT(3)

    3120.0  368  1  2    rept_line = rept_list, by name;

    003E40  E350 D340 0624  003120 |  STG   r5,#SPILL33(,r13,25408)
    003E46  E320 D270 0624  003120 |  STG   r2,#SPILL7(,r13,25200)
    003E4C  E350 D8FD 0571  003120 |  LAY   r5,_temp9(,r13,22781)
    003E52  E300 D368 0604  003120 |  LG    r0,#SPILL38(,r13,25448)
    003E58  E340 D308 0624  003120 |  STG   r4,#SPILL26(,r13,25352)
    003E5E  E310 D4B4 0271  003119 |  LAY   r1,LINE(,r13,9396)
    003E64  E300 D8FC 0550  003120 |  STY   r0,_temp9(,r13,22780)
    003E6A  E300 D148 0214  003120 |  LGF   r0,a1:d8520:l4(,r13,8520)
    003E70  D278 1000 4D33  003119 |  MVC   LINE(121,r1,0),REPT_INIT(r4,3379)
    003E76  4110 E00C       003120 |  LA    r1,_shadow21(,r14,12)
    003E7A  E3E0 D8FC 0571  003120 |  LAY   r14,_temp9(,r13,22780)
    003E80  DE03 E000 1000  003120 |  ED    _temp9(4,r14,0),_shadow21(r1,0)
    003E86  B914 00E0       003120 |  LGFR  r14,r0
    003E8A  E300 D368 0604  003120 |  LG    r0,#SPILL38(,r13,25448)
    003E90  4110 E003       003120 |  LA    r1,#AddressShadow(,r14,3)
    003E94  41F0 E00A       003120 |  LA    r15,#AddressShadow(,r14,10)
    003E98  D202 1001 5000  003120 |  MVC   _shadow21(3,r1,1),_temp9(r5,0)
    003E9E  9240 E003       003120 |  MVI   _shadow21(r14,3),64
    003EA2  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
    003EA8  E300 D984 0550  003120 |  STY   r0,_temp8(,r13,22916)
    003EAE  E350 D984 0571  003120 |  LAY   r5,_temp8(,r13,22916)
    003EB4  4120 E017       003120 |  LA    r2,#AddressShadow(,r14,23)
    003EB8  4110 100E       003120 |  LA    r1,_shadow21(,r1,14)
    003EBC  DE03 5000 1000  003120 |  ED    _temp8(4,r5,0),_shadow21(r1,0)
    003EC2  E310 D985 0571  003120 |  LAY   r1,_temp8(,r13,22917)
    003EC8  4140 E028       003120 |  LA    r4,#AddressShadow(,r14,40)
    003ECC  D202 F001 1000  003120 |  MVC   _shadow21(3,r15,1),_temp8(r1,0)
    003ED2  9240 E00A       003120 |  MVI   _shadow21(r14,10),64
    003ED6  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
    003EDC  E3F0 D974 0571  003120 |  LAY   r15,_temp19(,r13,22900)
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 2012-05-16 07:26, Elardus Engelbrecht wrote:
> Robert Prins wrote:
>> Can anyone skilled in the art tell me why a compiler that probably dates back to the late 1970s or early 1980s generates the following short and sweet code for a PL/I BY NAME assignment, while the not completely new (but still fairly recent) version of Enterprise PL/I (V3R9) generates the very, very, very long-winded code below it?
>
> I'm not skilled in this art, but is your Enterprise PL/I (V3R9) also using Language Environment or not?

Yes, it is.

> Then again, I always thought that the fastest instructions are the ones that are never executed... Those instructions don't need to be optimized... :-)

Exactly!

Robert
-- 
Robert AH Prins
robert(a)prino(d)org
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 5/16/2012 1:26 AM, Elardus Engelbrecht wrote:
> Robert Prins wrote:
>> Can anyone skilled in the art tell me why a compiler that probably dates back to the late 1970s or early 1980s generates the following short and sweet code for a PL/I BY NAME assignment, while the not completely new (but still fairly recent) version of Enterprise PL/I (V3R9) generates the very, very, very long-winded code below it?
>
> I'm not skilled in this art, but is your Enterprise PL/I (V3R9) also using Language Environment or not?

He has no choice on this: all the new compilers _must_ use LE.

> Then again, I always thought that the fastest instructions are the ones that are never executed... Those instructions don't need to be optimized... :-)
>
> Groete / Greetings
> Elardus Engelbrecht

-- 
Kind regards,
-Steve Comstock
The Trainer's Friend, Inc.
303-355-2752
http://www.trainersfriend.com

* To get a good Return on your Investment, first make an investment!
  + Training your people is an excellent investment
* Try our tool for calculating your Return On Investment for training dollars at http://www.trainersfriend.com/ROI/roi.html
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On Wed, 16 May 2012 06:41:27 -0600, Steve Comstock wrote:
> He has no choice on this: all the new compilers _must_ use LE.

Even Metal C?

-- gil
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Metal C does NOT use LE. And, of course, with HLASM you have the choice to not use LE or to make your program LE compatible.

Lloyd

----- Original Message -----
From: Paul Gilmartin paulgboul...@aim.com
To: IBM-MAIN@bama.ua.edu
Sent: Wed, May 16, 2012 9:12:24 AM
Subject: Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)

On Wed, 16 May 2012 06:41:27 -0600, Steve Comstock wrote:
> He has no choice on this: all the new compilers _must_ use LE.

Even Metal C?

-- gil
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 5/16/2012 7:11 AM, Paul Gilmartin wrote:
> On Wed, 16 May 2012 06:41:27 -0600, Steve Comstock wrote:
>> He has no choice on this: all the new compilers _must_ use LE.
>
> Even Metal C?
>
> -- gil

Well, I knew someone would raise that exception. No, Metal C does not use LE. Not sure if SP C (Systems Programmer C) is still around and it would be an exception too.

-- 
Kind regards,
-Steve Comstock
The Trainer's Friend, Inc.
303-355-2752
http://www.trainersfriend.com
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On Wed, 16 May 2012 07:55:48 -0600, Steve Comstock wrote:
> Well, I knew someone would raise that exception. No, Metal C does not use LE. Not sure if SP C (Systems Programmer C) is still around and it would be an exception too.

I believe it's been discussed in these fora that C and PL/I share an optimizer/code generator. I hope this includes Metal C. It's a long leap of logic, but that might weaken the argument for LE entanglement. Is MOVE, BY NAME plausibly dependent on LE?

-- gil
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
David,

On 2012-05-16 08:23, David Crayford wrote:
> Robert, I'm no expert but I have read that newer hardware models (z10 and above) are essentially RISC processors that run complex instructions in millicode.

I may be wrong, but I think the z196 is the first OOO machine and Enterprise PL/I V3R9 pre-dates it by two years.

> In the case of an MVC instruction it would have to do that in a loop, which would require branching, the enemy of pipelined execution units. It's also possible to run simple instructions in parallel. It's plausible that an MVC instruction can be executed more efficiently as a sequence of LG/STG instructions.

Moves are the most executed instructions, at least on x86 (see, among many others, www.ijpg.org/index.php/IJACSci/article/download/118/29), and I have little doubt that the same holds true for just about any other architecture. Given that, and given that there is special x86 circuitry to optimize MOVS instructions, it would be highly surprising if IBM did not make MVC as fast as possible, millicoded or not.

> The OOO decode units do this for you with instruction cracking on a z196; it seems that on a z10 the optimizer is doing the same thing.

Possibly, but that does not explain the 10 superfluous reloads of r1.

> See this document - page 21
> http://www-01.ibm.com/software/htp/tpf/tpfug/tgf11/How_do_you_do_when_youre_a_z196_CPU.pdf
>
> Optimizers create arcane code. It's almost impossible to verify without understanding the secret sauce. A lot of the code the optimizers spit out is intractable,

I don't know much about z/OS assembler, but at least I sort of managed to understand the code generated by the OS PL/I compiler. The code generated by Enterprise PL/I is completely unreadable; even some (or more than some) on this list might have trouble figuring out why it does what it does.

> and it's almost a paradox that a longer code path produces faster code. If you don't like it you can always compile at a different ARCH() level and ask IBM.
Going back to ARCH(5) doesn't produce anything that seems much shorter: there is still the ridiculous reloading of the same register, and oodles and oodles of instructions which would run and take time on a definitely not-OOO CPU:

    003A58  E300 8238 0014  003119 |  LGF   r0,LINE_PTR(,r8,568)
    003A5E  4110 E00C       003119 |  LA    r1,_shadow21(,r14,12)
    003A62  B914 00E0       003119 |  LGFR  r14,r0
    003A66  D278 B38E 6D33  003118 |  MVC   LINE(121,r11,910),REPT_INIT(r6,3379)
    003A6C  E3B0 DC20 0004  003119 |  LG    r11,#SPILL17(,r13,3104)
    003A72  50B0 D25C       003119 |  ST    r11,_temp9(,r13,604)
    003A76  DE03 D25C 1000  003119 |  ED    _temp9(4,r13,604),_shadow21(r1,0)
    003A7C  4110 E003       003119 |  LA    r1,#AddressShadow(,r14,3)
    003A80  41F0 E00A       003119 |  LA    r15,#AddressShadow(,r14,10)
    003A84  D202 1001 D25D  003119 |  MVC   _shadow21(3,r1,1),_temp9(r13,605)
    003A8A  9240 E003       003119 |  MVI   _shadow21(r14,3),64
    003A8E  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
    003A92  50B0 D2E4       003119 |  ST    r11,_temp8(,r13,740)
    003A96  41B0 E017       003119 |  LA    r11,#AddressShadow(,r14,23)
    003A9A  4110 100E       003119 |  LA    r1,_shadow21(,r1,14)
    003A9E  DE03 D2E4 1000  003119 |  ED    _temp8(4,r13,740),_shadow21(r1,0)
    003AA4  D202 F001 D2E5  003119 |  MVC   _shadow21(3,r15,1),_temp8(r13,741)
    003AAA  9240 E00A       003119 |  MVI   _shadow21(r14,10),64
    003AAE  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
    003AB2  E3F0 DB98 0004  003119 |  LG    r15,#SPILL0(,r13,2968)
    003AB8  D202 E011 1010  003119 |  MVC   _shadow21(3,r14,17),_shadow21(r1,16)
    003ABE  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
    003AC2  D206 D2D4 F4A4  003119 |  MVC   _temp19(7,r13,724),' ..'(r15,1188)
    003AC8  D203 D26C 1013  003119 |  MVC   _temp15(4,r13,620),_shadow18(r1,19)
    003ACE  4110 D26C       003119 |  LA    r1,_temp15(,r13,620)
    003AD2  D202 D24C 1001  003119 |  MVC   _temp11(3,r13,588),_shadow12(r1,1)
    003AD8  4110 D24C       003119 |  LA    r1,_temp11(,r13,588)
    003ADC  DE06 D2D4 1000  003119 |  ED    _temp19(7,r13,724),_temp11(r1,0)
    003AE2  D205 B000 D2D5  003119 |  MVC   _shadow21(6,r11,0),_temp19(r13,725)
    003AE8  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
    003AEC  D206 D2CC F4A4  003119 |  MVC   _temp21(7,r13,716),' ..'(r15,1188)
    003AF2  D202 D249 101B  003119 |  MVC   _temp18(3,r13,585),_shadow12(r1,27)
    003AF8  D202 D246 D249  003119 |  MVC   _temp20(3,r13,582),_temp18(r13,585)
    003AFE  4110 E028       003119 |  LA    r1,#AddressShadow(,r14,40)
    003B02  E300 D246 0090  003119 |  LLGC  r0,a1:d582:l1(,r13,582)
    003B08  E300 3114 0080  003119 |  NG    r0,=X' 000F'
    003B0E  41B0 D246       003119 |  LA    r11,_temp20(,r13,582)
    003B12  4200 D246       003119 |  STC   r0,a1:d582:l1(,r13,582)
    003B16  DE06 D2CC B000  003119 |  ED    _temp21(7,r13,716),_temp20(r11,0)
    003B1C  D204 1000 D2CE  003119 |  MVC   _shadow21(5,r1,0),_temp21(r13,718)
    003B22  5810 8000       003119 |  L
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On 2012-05-16 14:59, Paul Gilmartin wrote:
> On Wed, 16 May 2012 07:55:48 -0600, Steve Comstock wrote:
>> Well, I knew someone would raise that exception. No, Metal C does not use LE. Not sure if SP C (Systems Programmer C) is still around and it would be an exception too.
>
> I believe it's been discussed in these fora that C and PL/I share an optimizer/code generator. I hope this includes Metal C. It's a long leap of logic, but that might weaken the argument for LE entanglement. Is MOVE, BY NAME plausibly dependent on LE?

For PL/I it most definitely is not. It's just a shortcut for lazy people, and I've worked at sites that explicitly forbade its use, considering it just as bad as a SELECT * in SQL.

Robert
-- 
Robert AH Prins
robert(a)prino(d)org
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Hi,

Do you have the chance to compare the speed of the two codes?

> David,
>
> On 2012-05-16 08:23, David Crayford wrote:
>> Robert, I'm no expert but I have read that newer hardware models (Z10 and above) are essentially RISC processors that run complex instructions in millicode. In the
>
> I may be wrong, but I think the z196 is the first OOO machine and Enterprise PL/I V3R9 pre-dates it by two years.
>
>> case of a MVC instruction it would have to do that in a loop which would require branching, the enemy of pipelined execution units. It's also possible to run simple instructions in parallel. It's plausible an MVC instruction can be executed more efficiently as a sequence of LG/STG instructions.
>
> Given that moves are the most executed instructions, at least on x86 (see, among many others, www.ijpg.org/index.php/IJACSci/article/download/118/29), that I have little doubt the same holds true for just about any other architecture, and that there is special x86 circuitry to optimize MOVS instructions, it would be highly surprising if IBM did not make MVC as fast as possible, millicoded or not.
>
>> The OOO decode units do this for you with instruction cracking on a z196, it seems that on a z10 the optimizer is doing the same thing.
>
> Possibly, but that does not explain the 10 superfluous reloads of r1.
>
>> See this document - page 21
>> http://www-01.ibm.com/software/htp/tpf/tpfug/tgf11/How_do_you_do_when_youre_a_z196_CPU.pdf
>>
>> Optimizers create arcane code. It's almost impossible to verify without understanding the secret sauce. A lot of the code the optimizers spit out is intractable,
>
> I don't know much about z/OS assembler, but at least I sort of managed to understand the code generated by the OS PL/I compiler. The code generated by Enterprise PL/I is completely unreadable; even some (or more than some) on this list might have trouble figuring out why it does what it does.
>
>> and it's almost a paradox that a longer code path produces faster code. If you don't like it you can always compile at a different ARCH() level and ask IBM.
>
> Going back to ARCH(5) doesn't produce anything that seems much shorter: still the ridiculous reloading of the same register, and oodles and oodles of instructions which would run and take time on a definitely not-OOO CPU:
>
> 003A58  E300 8238 0014  003119 |  LGF   r0,LINE_PTR(,r8,568)
> 003A5E  4110 E00C       003119 |  LA    r1,_shadow21(,r14,12)
> 003A62  B914 00E0       003119 |  LGFR  r14,r0
> 003A66  D278 B38E 6D33  003118 |  MVC   LINE(121,r11,910),REPT_INIT(r6,3379)
> 003A6C  E3B0 DC20 0004  003119 |  LG    r11,#SPILL17(,r13,3104)
> 003A72  50B0 D25C       003119 |  ST    r11,_temp9(,r13,604)
> 003A76  DE03 D25C 1000  003119 |  ED    _temp9(4,r13,604),_shadow21(r1,0)
> 003A7C  4110 E003       003119 |  LA    r1,#AddressShadow(,r14,3)
> 003A80  41F0 E00A       003119 |  LA    r15,#AddressShadow(,r14,10)
> 003A84  D202 1001 D25D  003119 |  MVC   _shadow21(3,r1,1),_temp9(r13,605)
> 003A8A  9240 E003       003119 |  MVI   _shadow21(r14,3),64
> 003A8E  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
> 003A92  50B0 D2E4       003119 |  ST    r11,_temp8(,r13,740)
> 003A96  41B0 E017       003119 |  LA    r11,#AddressShadow(,r14,23)
> 003A9A  4110 100E       003119 |  LA    r1,_shadow21(,r1,14)
> 003A9E  DE03 D2E4 1000  003119 |  ED    _temp8(4,r13,740),_shadow21(r1,0)
> 003AA4  D202 F001 D2E5  003119 |  MVC   _shadow21(3,r15,1),_temp8(r13,741)
> 003AAA  9240 E00A       003119 |  MVI   _shadow21(r14,10),64
> 003AAE  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
> 003AB2  E3F0 DB98 0004  003119 |  LG    r15,#SPILL0(,r13,2968)
> 003AB8  D202 E011 1010  003119 |  MVC   _shadow21(3,r14,17),_shadow21(r1,16)
> 003ABE  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
> 003AC2  D206 D2D4 F4A4  003119 |  MVC   _temp19(7,r13,724),' ..'(r15,1188)
> 003AC8  D203 D26C 1013  003119 |  MVC   _temp15(4,r13,620),_shadow18(r1,19)
> 003ACE  4110 D26C       003119 |  LA    r1,_temp15(,r13,620)
> 003AD2  D202 D24C 1001  003119 |  MVC   _temp11(3,r13,588),_shadow12(r1,1)
> 003AD8  4110 D24C       003119 |  LA    r1,_temp11(,r13,588)
> 003ADC  DE06 D2D4 1000  003119 |  ED    _temp19(7,r13,724),_temp11(r1,0)
> 003AE2  D205 B000 D2D5  003119 |  MVC   _shadow21(6,r11,0),_temp19(r13,725)
> 003AE8  5810 8000       003119 |  L     r1,REPT_PTR(,r8,0)
> 003AEC  D206 D2CC F4A4  003119 |  MVC   _temp21(7,r13,716),' ..'(r15,1188)
> 003AF2  D202 D249 101B  003119 |  MVC   _temp18(3,r13,585),_shadow12(r1,27)
> 003AF8  D202 D246 D249  003119 |  MVC   _temp20(3,r13,582),_temp18(r13,585)
> 003AFE  4110 E028       003119 |  LA    r1,#AddressShadow(,r14,40)
> 003B02  E300 D246 0090  003119 |  LLGC  r0,a1:d582:l1(,r13,582)
> 003B08  E300 3114 0080  003119 |  NG    r0,=X' 000F'
> 003B0E  41B0 D246       003119 |  LA    r11,_temp20(,r13,582)
> 003B12  4200 D246       003119 |  STC   r0,a1:d582:l1(,r13,582)
> 003B16  DE06 D2CC B000
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On Wed, 16 May 2012 17:21:25 +0200, Miklos Szigetvari wrote:
> Do you have the chance to compare the speed of the two codes ?

Does execution speed always trump code size? Where should the tradeoff be? For example, any loop with a fixed number of iterations (even a million) could be flattened to linear code; fewer instructions executed; no test and branch. (But it might yet be slower because of instruction cache faults.)

-- gil
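The trade-off gil describes can be sketched in C (used here only for illustration; the thread itself is about PL/I and assembler). All names below (copy_looped, copy_flat, CHUNK, NCHUNKS) are hypothetical and are not taken from any compiler output in this thread. Both functions copy the same four 256-byte pieces; the first pays for loop control on every iteration, the second has no test and branch but repeats the copy statement in the code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
enum { CHUNK = 256, NCHUNKS = 4 };

/* Looped form: one copy statement in the source, but every iteration
   also executes an increment, a compare and a branch. */
void copy_looped(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < NCHUNKS; i++)
        memcpy(dst + i * CHUNK, src + i * CHUNK, CHUNK);
}

/* Flattened form: no loop control at all, at the price of NCHUNKS
   copies of the statement in the generated code. */
void copy_flat(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst + 0 * CHUNK, src + 0 * CHUNK, CHUNK);
    memcpy(dst + 1 * CHUNK, src + 1 * CHUNK, CHUNK);
    memcpy(dst + 2 * CHUNK, src + 2 * CHUNK, CHUNK);
    memcpy(dst + 3 * CHUNK, src + 3 * CHUNK, CHUNK);
}
```

Both functions produce the same result. Which one is faster cannot be read off the source: an optimizing compiler is free to unroll the first or to re-roll the second, and with a million iterations instead of four the flattened form would grow accordingly, which is where the instruction cache concern comes in. Only a measurement on the target machine settles it.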
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
On Tue, 15 May 2012 20:07:52 +, Robert Prins wrote:
> maybe a 16-byte three-instruction sequence like
>
> 003FC0  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
> 003FC6  E300 1047 0015  003120 |  LGH   r0,_shadow20(,r1,71)
> 003FCC  4000 E064       003120 |  STH   r0,_shadow20(,r14,100)
>
> is really faster than the simple 6-byte one-instruction sequence
>
> 0026D4  D2 01 7 064 6 047  MVC   REPT_LINE.DATE.MONTH(2),REPT_LIST.DATE.MONTH

Not likely. Address Generation Interlock (AGI) will cause the second instruction to stall until the address is available in R1.

In addition, instruction cracking will, under some circumstances, cause a z196 processor to execute a load and a store when an MVC instruction is executed.

-- Tom Marchant
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
First, I would like to thank you for starting this thread.

I posted it to the performance people of my customer, and they told me that they had just found a similar problem with EP PL/1 3.9: the PLIMOVE calls no longer generate MVCLs, as in previous releases, but series of MVCs and loops, even when the length of the PLIMOVE is, for example, 8000 bytes. They discovered it because one of the PLIMOVE locations showed up in a Strobe report.

I asked them to test, using an ASSEMBLER program, whether the MVC loop is faster, but they told me that even with lengths around 500 or 600 the MVCL solution is faster; this is on a z196. I still have to confirm this. If it turns out to be true, this sounds like a bug, and we will try to convince IBM to go back to the previous solution. If we compile our modules during normal service using EP PL/1 3.9, our system will get slower and slower, because PLIMOVE is widely used. This is not acceptable.

Because the PLIMOVEs are generated by a site-specific macro called PLICOPY, I have already thought about calling a short ASSEMBLER routine (with minimal linkage conventions) that does the transfer using MVCL instead of CALL PLIMOVE. The applications would not need to be changed, because the PLICOPY syntax stays the same. Maybe this could still be faster than doing the MVC loop.

Kind regards

Bernd

On 16.05.2012 19:05, Robert Prins wrote:
> On 2012-05-16 14:59, Paul Gilmartin wrote:
>> On Wed, 16 May 2012 07:55:48 -0600, Steve Comstock wrote:
>>> Well, I knew someone would raise that exception. No, Metal C does not use LE. Not sure if SP C (Systems Programmer C) is still around and it would be an exception too.
>>
>> I believe it's been discussed in these fora that C and PL/I share an optimizer/code generator. I hope this includes Metal C. It's a long leap of logic, but that might weaken the argument for LE entanglement. Is MOVE, BY NAME plausibly dependent on LE?
>
> For PL/I it is most definitely not, it's just a shortcut for lazy people and I've worked at sites that explicitly forbade its use, considering it just as bad as a SELECT * in SQL.
>
> Robert
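The difference between the two code shapes Bernd describes can be modelled in C (for illustration only; the function names and the MVC_MAX macro are hypothetical, and the model says nothing about the relative speed on a real z196). It relies on one architectural fact: a single MVC moves at most 256 bytes, so a longer move needs either a series of MVCs or one long-move instruction such as MVCL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A single MVC instruction moves at most 256 bytes. */
#define MVC_MAX 256u

/* Number of MVC-sized pieces needed for len bytes (rounded up). */
size_t mvc_pieces(size_t len)
{
    return (len + MVC_MAX - 1) / MVC_MAX;
}

/* Model of the "MVC loop" shape: copy in pieces of at most 256 bytes
   and return how many pieces were moved. Buffers must not overlap. */
size_t move_in_pieces(unsigned char *dst, const unsigned char *src, size_t len)
{
    size_t pieces = 0;

    assert(dst != NULL && src != NULL);
    while (len > 0) {
        size_t n = len < MVC_MAX ? len : MVC_MAX;
        memcpy(dst, src, n);
        dst += n;
        src += n;
        len -= n;
        pieces++;
    }
    return pieces;
}

/* Model of the "one long move" shape (what MVCL offers as a single
   instruction). Buffers must not overlap. */
void move_at_once(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
}
```

For the 8000-byte PLIMOVE mentioned above, mvc_pieces(8000) is 32: 31 full 256-byte moves plus one final move of 64 bytes. For lengths around 500 or 600 the count is 2 or 3. Both shapes leave the same bytes in the target; which one runs faster is exactly the open question in this thread and has to be measured on the hardware.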
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
The Hercules group did some testing comparing MVCL to MVC. If both source and destination had the same alignment relative to a doubleword boundary, MVCL could move 8 bytes and then increment its 4 registers to reflect this before becoming interruptible again. If they were aligned differently, each 8 bytes would take two such steps. An MVC, by contrast, would move 256 bytes or less without being interrupted or touching registers, and was much faster. Of course, emulation is much different from hardware, i.e., updating all 4 registers at once.

--
Mike A Schwab, Springfield IL USA
Where do Forest Rangers go to get away from it all?
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Guys,

I'm a little lost on the point of this thread. I'm not trying to be rude or flippant; I just see a lot about performance. Is it valid with today's hardware and software? Yes, there are time-sensitive events, software and hardware.

Scott Ford
www.identityforge.com
Re: Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Oh, now I see what Bernd was talking about. Sorry guys, old age.

Scott Ford
www.identityforge.com
Comparison of compiler generated code AD 1980(ish) v 2010(ish)
Can anyone skilled in the art tell me why a compiler that probably dates back to the late 1970s or early 1980s generates the following short and sweet code for a PL/I BY NAME assignment, while the not completely new (but still fairly recent) version of Enterprise PL/I (V3R9) generates the very, very, very long-winded code below it? Or is this (V3R9) code (that predates the OOO z196 architecture) really faster?

OS PL/I V2.3.0 - OPT(2)

  343 1 2  REPT_LINE= REPT_LIST, BY NAME;

* STATEMENT NUMBER 343
002664  58 70 8 268        L     7,REPT_WORK.LINE_PTR
002668  58 60 8 030        L     6,REPT_WORK.REPT_PTR
00266C  58 F0 3 600        L     15,1536(0,3)
002670  D2 03 7 003 F B54  MVC   REPT_LINE.TR(4),2900(15)
002676  DE 03 7 003 6 00C  ED    REPT_LINE.TR(4),REPT_LIST.TR
00267C  D2 03 7 00A F B54  MVC   REPT_LINE.RE(4),2900(15)
002682  DE 03 7 00A 6 00E  ED    REPT_LINE.RI(4),REPT_LIST.RI
002688  D2 02 7 011 6 010  MVC   REPT_LINE.DA(3),REPT_LIST.DA
00268E  58 E0 3 608        L     14,1544(0,3)
002692  D2 06 4 158 E 5D4  MVC   344(7,4),1492(14)
002698  DE 06 4 158 6 014  ED    344(7,4),REPT_LIST.K+1
00269E  D2 05 7 017 4 159  MVC   REPT_LINE.K(6),345(4)
0026A4  D2 06 4 158 E 5D4  MVC   344(7,4),1492(14)
0026AA  DE 06 4 158 6 01B  ED    344(7,4),REPT_LIST.V
0026B0  D2 04 7 028 4 15A  MVC   REPT_LINE.V(5),346(4)
0026B6  D2 03 7 030 6 026  MVC   REPT_LINE.NA(4),REPT_LIST.NA
0026BC  D2 03 7 036 6 02A  MVC   REPT_LINE.TY(4),REPT_LIST.TY
0026C2  D2 03 7 03D 6 02E  MVC   REPT_LINE.CO(4),REPT_LIST.CO
0026C8  D2 00 7 04B 6 036  MVC   REPT_LINE.SP(1),REPT_LIST.SP
0026CE  D2 03 7 05F 6 043  MVC   REPT_LINE.DATE.YEAR(4),REPT_LIST.DATE.YEAR
0026D4  D2 01 7 064 6 047  MVC   REPT_LINE.DATE.MONTH(2),REPT_LIST.DATE.MONTH
0026DA  D2 01 7 067 6 049  MVC   REPT_LINE.DATE.DAY(2),REPT_LIST.DATE.DAY

Enterprise PL/I for z/OS V3.R9.M0 (Built:20100923) - OPT(3)

  3120.0 368 1 2  rept_line= rept_list, by name;

003E40  E350 D340 0624  003120 |  STG   r5,#SPILL33(,r13,25408)
003E46  E320 D270 0624  003120 |  STG   r2,#SPILL7(,r13,25200)
003E4C  E350 D8FD 0571  003120 |  LAY   r5,_temp9(,r13,22781)
003E52  E300 D368 0604  003120 |  LG    r0,#SPILL38(,r13,25448)
003E58  E340 D308 0624  003120 |  STG   r4,#SPILL26(,r13,25352)
003E5E  E310 D4B4 0271  003119 |  LAY   r1,LINE(,r13,9396)
003E64  E300 D8FC 0550  003120 |  STY   r0,_temp9(,r13,22780)
003E6A  E300 D148 0214  003120 |  LGF   r0,a1:d8520:l4(,r13,8520)
003E70  D278 1000 4D33  003119 |  MVC   LINE(121,r1,0),REPT_INIT(r4,3379)
003E76  4110 E00C       003120 |  LA    r1,_shadow21(,r14,12)
003E7A  E3E0 D8FC 0571  003120 |  LAY   r14,_temp9(,r13,22780)
003E80  DE03 E000 1000  003120 |  ED    _temp9(4,r14,0),_shadow21(r1,0)
003E86  B914 00E0       003120 |  LGFR  r14,r0
003E8A  E300 D368 0604  003120 |  LG    r0,#SPILL38(,r13,25448)
003E90  4110 E003       003120 |  LA    r1,#AddressShadow(,r14,3)
003E94  41F0 E00A       003120 |  LA    r15,#AddressShadow(,r14,10)
003E98  D202 1001 5000  003120 |  MVC   _shadow21(3,r1,1),_temp9(r5,0)
003E9E  9240 E003       003120 |  MVI   _shadow21(r14,3),64
003EA2  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
003EA8  E300 D984 0550  003120 |  STY   r0,_temp8(,r13,22916)
003EAE  E350 D984 0571  003120 |  LAY   r5,_temp8(,r13,22916)
003EB4  4120 E017       003120 |  LA    r2,#AddressShadow(,r14,23)
003EB8  4110 100E       003120 |  LA    r1,_shadow21(,r1,14)
003EBC  DE03 5000 1000  003120 |  ED    _temp8(4,r5,0),_shadow21(r1,0)
003EC2  E310 D985 0571  003120 |  LAY   r1,_temp8(,r13,22917)
003EC8  4140 E028       003120 |  LA    r4,#AddressShadow(,r14,40)
003ECC  D202 F001 1000  003120 |  MVC   _shadow21(3,r15,1),_temp8(r1,0)
003ED2  9240 E00A       003120 |  MVI   _shadow21(r14,10),64
003ED6  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
003EDC  E3F0 D974 0571  003120 |  LAY   r15,_temp19(,r13,22900)
003EE2  D202 E011 1010  003120 |  MVC   _shadow21(3,r14,17),_shadow21(r1,16)
003EE8  E310 D238 0604  003120 |  LG    r1,#SPILL0(,r13,25144)
003EEE  D206 F000 14A4  003120 |  MVC   _temp19(7,r15,0),' ..'(r1,1188)
003EF4  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
003EFA  D203 B95C 1013  003120 |  MVC   _temp15(4,r11,2396),_shadow18(r1,19)
003F00  E310 D90C 0571  003120 |  LAY   r1,_temp15(,r13,22796)
003F06  D202 B93C 1001  003120 |  MVC   _temp11(3,r11,2364),_shadow12(r1,1)
003F0C  E310 D8EC 0571  003120 |  LAY   r1,_temp11(,r13,22764)
003F12  DE06 F000 1000  003120 |  ED    _temp19(7,r15,0),_temp11(r1,0)
003F18  E310 D975 0571  003120 |  LAY   r1,_temp19(,r13,22901)
003F1E  D205 2000 1000  003120 |  MVC   _shadow21(6,r2,0),_temp19(r1,0)
003F24  E310 D238 0604  003120 |  LG    r1,#SPILL0(,r13,25144)
003F2A  E320 D96C 0571  003120 |  LAY   r2,_temp21(,r13,22892)
003F30  D206 2000 14A4  003120 |  MVC   _temp21(7,r2,0),' ..'(r1,1188)
003F36  E310 DF10 0158  003120 |  LY    r1,a1:d7952:l4(,r13,7952)
003F3C  D202 B939 101B  003120 |  MVC