Re: Avoiding SIIS - (Was Base-less macros)
Wendell wrote:
> From Tony's explanation concerning moving the TR command to the data area via
> LOCTR, can I surmise that it might be counter-productive to do so--that
> leaving the TR and EX commands in consecutive memory is more efficient?

My $0.02 is that I'm not so sure about the "do it once with length of 1, then do it for real with EX" approach, but what is mentioned in the Kevin Shum paper that Tony referenced is that you want the target of the EX instruction "close" to the EX itself, so that the target instruction might already be in the i-cache and won't have to be fetched. I usually try to place it like this:

SUBROUTINE_1 DS    0H
             ...
             EX    R2,EX_TARGET
             ...
             BR    R14
.
EX_TARGET    MVC   0(*-*,R3),0(R4)
.
SUBROUTINE_2 DS    0H
             ...

so that the EX target isn't in a data area, but is "close" to the EX itself.

==
Adam Johanson
R Software Engineer
adam.johan...@broadcom.com

--
This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it.
Re: Avoiding SIIS - (Was Base-less macros)
Sorry, senior moment. But it will do it twice, and the double translate of the first byte is probably not what you want.

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

From: IBM Mainframe Assembler List [ASSEMBLER-LIST@LISTSERV.UGA.EDU] on behalf of Keith Moe [ke...@sbcglobal.net]
Sent: Thursday, November 11, 2021 4:51 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Avoiding SIIS - (Was Base-less macros)

Actually the inline TR/EX will do the TR the first time for ONE byte, not 256, followed by the EX of the specified length.

Keith Moe
BMC Software
Re: Avoiding SIIS - (Was Base-less macros)
Actually the inline TR/EX will do the TR the first time for ONE byte, not 256, followed by the EX of the specified length.

Keith Moe
BMC Software

On Thursday, November 11, 2021, 01:44:07 PM PST, Seymour J Metz wrote:

There are bigger problems than cache in that example; the EX/TR will translate twice, the first time with a length of 256.

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3
Re: Avoiding SIIS - (Was Base-less macros)
There are bigger problems than cache in that example; the EX/TR will translate twice, the first time with a length of 256.

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

From: IBM Mainframe Assembler List [ASSEMBLER-LIST@LISTSERV.UGA.EDU] on behalf of Tony Harminc [t...@harminc.com]
Sent: Wednesday, November 10, 2021 11:53 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Avoiding SIIS - (Was Base-less macros)

On Wed, 10 Nov 2021 at 11:45, Wendell Lovewell <09624390d784-dmarc-requ...@listserv.uga.edu> wrote:
> Example 1:
> TxtTRNull TR    0(*-*,R14),NoNulls
>           EX    R5,*-6
Re: Avoiding SIIS - (Was Base-less macros)
A CLC/EX or MVC/EX sequence is okay, albeit inefficient, but a TR/EX sequence would be bad news.

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

From: IBM Mainframe Assembler List [ASSEMBLER-LIST@LISTSERV.UGA.EDU] on behalf of Wendell Lovewell [09624390d784-dmarc-requ...@listserv.uga.edu]
Sent: Thursday, November 11, 2021 12:18 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Avoiding SIIS - (Was Base-less macros)

Adam & Tony--thanks for the good explanations.

From Tony's explanation concerning moving the TR command to the data area via LOCTR, can I surmise that it might be counter-productive to do so--that leaving the TR and EX commands in consecutive memory is more efficient?

> Example 2:
> PgmConst  LOCTR ,
> TxtTRNull TR    0(*-*,R14),NoNulls
> PgmCode   LOCTR ,
>           EX    R5,TxtTRNull
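To make the TR/EX concern concrete, here is a minimal sketch that reuses the labels from Example 1 and assumes R5 holds the length minus one and R14 points at the text. With the TR coded inline ahead of the EX, the TR first runs on its own with a length code of zero (one byte), so the first byte goes through NoNulls twice. That gives a wrong result unless the table maps every output value to itself. An MVC or CLC in the same position generally ends up with the same final result, which is why those are merely inefficient. The second fragment, with its labels TxtTR2 and TRDONE, is an illustrative assumption and not code from the thread.

```hlasm
* Inline form from Example 1: byte 0 is translated twice.
*   1st pass: TR runs as coded, length code 0 = 1 byte
*   2nd pass: EX ORs R5 bits 56-63 into the length code,
*             so (R5)+1 bytes are translated, byte 0 included
TxtTRNull TR    0(*-*,R14),NoNulls   byte 0 translated here...
          EX    R5,TxtTRNull         ...and translated again here
*
* Alternative sketch: keep the TR out of the fall-through path so
* that it is only ever reached through EX.
          EX    R5,TxtTR2            translate (R5)+1 bytes, once
          B     TRDONE               hypothetical label
TxtTR2    TR    0(*-*,R14),NoNulls   EX target only
TRDONE    DS    0H
```

Placing the target after an unconditional branch or a BR R14, as Adam describes elsewhere in the thread, gives the same single execution while keeping the target near the EX.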
Re: Avoiding SIIS - (Was Base-less macros)
Adam & Tony--thanks for the good explanations.

From Tony's explanation concerning moving the TR command to the data area via LOCTR, can I surmise that it might be counter-productive to do so--that leaving the TR and EX commands in consecutive memory is more efficient?

> Example 2:
> PgmConst  LOCTR ,
> TxtTRNull TR    0(*-*,R14),NoNulls
> PgmCode   LOCTR ,
>           EX    R5,TxtTRNull

In case anyone else is interested...
https://www.ibm.com/support/pages/system/files/inline-files/siis_coding_examples_v2.pdf
contains some examples and these guidelines:

The following are general coding guidelines which should be followed to write well-formed Assembler code.

• Don't mix or interleave in instructions / executable code any data or operands which are the target of store or update operations.

• Generate dummy areas of separation in a sufficient length at those locations where instruction and data/operands are adjacent to each other. An Assembler statement of the type:

    Label    DC    XL256'00'

  can be used to brute force this objective. Of course, there are smarter ways to separate Instruction cache lines from data cache lines based on knowledge of the cache line boundaries.

• Guarantee the 4k page alignment of any Assembler load module via the BINDER/LINKAGE Editor control statement PAGE:

    //LKED.SYSIN DD *
      PAGE DRIVER
      NAME DRIVER(R)

• Where possible write to a re-entrant standard.

Thanks again,
Wendell
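As a rough illustration of the first two guidelines, the sketch below keeps a stored-into field out of the instruction stream and puts a 256-byte pad between the code and that field. The names DRIVER (borrowed from the binder example), CALLCNT and PAD are made up for the illustration, and the 256-byte figure assumes the cache line size discussed elsewhere in this thread. The routine is deliberately not re-entrant, to keep the example short.

```hlasm
DRIVER   CSECT
         STM   14,12,12(13)       save caller's registers
         LR    12,15              establish base register
         USING DRIVER,12
         L     1,CALLCNT          load counter from the data area
         AHI   1,1                bump it
         ST    1,CALLCNT          the store lands beyond the pad
         LM    14,12,12(13)       restore caller's registers
         SR    15,15              return code 0
         BR    14
         LTORG ,                  read-only literals may stay here
PAD      DC    XL256'00'          a full line of distance
CALLCNT  DC    F'0'               stored-into data starts after pad
         END   DRIVER
```

Because the pad is a full 256 bytes, the last code byte and the first updated byte cannot fall in the same 256-byte line wherever the module is loaded. The cost is up to a line of wasted space, which is why the guideline calls it brute force.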
Re: Avoiding SIIS - (Was Base-less macros)
If size isn't a problem: "skel64" is csect name:

* Round start of data to next 256-byte boundary to separate from ipipe zArch
         org   skel64+(((*+256)-skel64)/256)*256
         LTORG
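For anyone checking the arithmetic, here is a worked sketch of that ORG expression as assembler comments, followed by a variant that rounds up only when needed. The variant is my own assumption and is untested. Both forms measure from the start of skel64, so as far as I understand they line up with real cache lines only if the section itself is loaded on a 256-byte (or page) boundary.

```hlasm
* Worked values for skel64+(((*+256)-skel64)/256)*256, using
* integer division and offsets measured from skel64:
*   * at offset X'1234': (X'1234'+256)/256 = 19, *256 = X'1300'
*   * at offset X'1200': (X'1200'+256)/256 = 19, *256 = X'1300'
* So an already aligned location still skips a full 256 bytes.
*
* Variant that rounds up only when needed (assumption, untested):
*   * at offset X'1234': (X'1234'+255)/256 = 19, *256 = X'1300'
*   * at offset X'1200': (X'1200'+255)/256 = 18, *256 = X'1200'
         ORG   skel64+(((*+255)-skel64)/256)*256
         LTORG
```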
Re: Avoiding SIIS - (Was Base-less macros)
On Wed, 10 Nov 2021 at 11:45, Wendell Lovewell <09624390d784-dmarc-requ...@listserv.uga.edu> wrote:
>
> I'm reluctant to admit this, but I'm still unclear about SIIS issues. Could
> someone please explain what happens to the D- and I-cache in these situations?

I can tell you my understanding, but it's certainly not definitive. I just read the books that the people who really know write.

Most probably none of these examples does a SIIS, but there isn't enough information to be 100% sure. I will assume that none of R14 in examples 1 and 2, or R5 in example 3, points anywhere near the code you are showing. I will also assume that a cache line is 256 bytes, as it is on all recent machines. (You can use the unprivileged ECAG instruction to ask the machine what the I- and D- cache line sizes are. But of course you can't take that run-time information and turn it back into input to the assembler. A Just In Time compiler, e.g. for Java could do and probably does do that.)

> Example 1:
> TxtTRNull TR    0(*-*,R14),NoNulls
>           EX    R5,*-6

No SIIS. The fetch of the target of EX/EXRL is defined to be an instruction fetch, so chances are the TR and the EX are in the same I-cache line, in which case that is the only I-cache line in use locally. If the TR happens to end on a cache line boundary, then fetching the EX will bring in another I-cache line. The line containing the TR is unlikely to be discarded, because it was just used.

The two operands of TR could involve as many as two D-cache lines each, or as few as one in total, depending on where the operands lie.

> Example 2:
> PgmConst  LOCTR ,
> TxtTRNull TR    0(*-*,R14),NoNulls
> PgmCode   LOCTR ,
>           EX    R5,TxtTRNull

No SIIS. Almost the same as example 1, but the TR and EX are more likely to be in different cache lines because they're more likely to be further apart. You don't show it, but if the PgmConst area contains data as well as the TR instruction, then referencing that data will bring it into a D-cache line. It's not wrong to have this situation, and any performance hit should come only from the fact of having the same bytes in two cache lines, and therefore excluding some other info. I don't believe there is any direct interaction as long as nobody stores into either area. But note that that includes any code that stores into any part of the cache line, and that in turn includes code that may not be executed in a given case but that has been fetched and analysed as part of branch prediction.

> Example 3:
>          GENCB BLK=ACB,AM=VSAM,MACRF=(KEY,DIR,SEQ,IN),
>                LOC=ANY,RMODE31=ALL,
>                MF=(G,(R5),GENCBLN)
> +GENCBLN EQU   56                 LENGTH OF PARM LIST AREA USED
> +        CNOP  0,4
> +        BAL   15,*+44            BRANCH OVER CONSTANTS
> +        DC    AL1(160)           BLOCK TYPE CODE
> +        DC    AL1(1)             FUNCTION TYPE CODE
>    (16 "DC" lines removed)
> +        DC    AL2(0)             RESERVED @
> +        DC    B'1000'
> +        LR    1,R5               POINT TO PARAMETER LIST AREA
> +        MVC   16(40,1),0(15)     MOVE ACES TO AREA

No SIIS. Some of the 40 bytes after the BAL 15,*+44 are likely to be in both kinds of cache line after MVC executes. Still no problem - again as long as nobody is storing into any part of the data that lives in the cache line.

Oh - and after typing all that, here's the quote I've been looking for from the IBM Z / LinuxONE System Processor Optimization Primer that was mentioned here a few months ago:

"No performance concern is expected with read-only copies of the same cache line in both the instruction and data caches. The SIIS inefficiency occurs when the processor detects the same line is in both the instruction and data caches and the data cache's copy is potentially to be updated (including any conditional paths not expected to be executed), at which point an expensive cache synchronization action is needed. So long as both copies of the line in the instruction and data caches remain identical, the synchronization action does not occur, and there should be no performance penalty."

> (Please forgive the formatting - it's tough to line things up in a
> proportional font.)

Virtually impossible. I see some of your lines well aligned; others not so much. But it's not uncomfortable to read, so thanks for taking the trouble.

Tony H.
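For reference, a minimal sketch of the ECAG query mentioned above. It assumes the operand encoding as I recall it from the Principles of Operation: the second-operand address carries an attribute indication in bits 56-59 (1 = line size), a cache level in bits 60-62 (0 = first level) and a type in bit 63 (0 = data or unified, 1 = instruction). Please verify that against the manual before relying on it. The register choices are arbitrary and the third-operand register is not used by the instruction.

```hlasm
* Query first-level cache line sizes at run time (needs the
* general-instructions-extension facility).
* Second-operand address = attribute*16 + level*2 + type
         ECAG  2,0,X'10'          R2 = L1 data cache line size
         ECAG  3,0,X'11'          R3 = L1 instr cache line size
```

On the machines discussed in this thread both registers would be expected to come back as 256. As noted above, the value is known only at run time, so it cannot drive assembly-time alignment.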
Re: Avoiding SIIS - (Was Base-less macros)
For current processors, 256 bytes.

Some IBMers talking with us on these topics suggested a macro called NEXTCLB, which inserts some space depending on some address arithmetic. IMO, if you use this, you could adjust it later if a future processor requires more. (256 is a constant used inside NEXTCLB).

HTH, kind regards

Bernd

On 10.11.2021 at 19:05, Schmitt, Michael wrote:
> How do you ensure that your storage areas are far enough away from the code
> to not be in the instruction cache line, when your data isn't in GETMAIN
> storage? You can mitigate by putting constants in between, but how do you
> know if that's enough?
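The source of NEXTCLB isn't shown in this thread, so the following is only a guess at what a macro of that kind might look like, not IBM's code. The name ALIGNCL and the LINE= keyword are hypothetical. The sketch assumes it is invoked inside a named CSECT (it relies on &SYSECT), that no LOCTR is in play, and that the section is loaded on a line boundary; otherwise the offsets it rounds do not correspond to real cache lines.

```hlasm
         MACRO
&LABEL   ALIGNCL &LINE=256
.* Hypothetical stand-in for a NEXTCLB-style macro.
.* Advances the location counter to the next multiple of &LINE,
.* measured from the start of the current control section.
         ORG   &SYSECT+((*-&SYSECT+&LINE-1)/&LINE)*&LINE
&LABEL   DS    0X                 label marks the aligned spot
         MEND
```

A call such as `DATA0 ALIGNCL LINE=256` after the last instruction would then start the data on the next 256-byte offset, and a larger line size on a future processor would need only a change to the LINE default.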
Re: Avoiding SIIS - (Was Base-less macros)
How do you ensure that your storage areas are far enough away from the code to not be in the instruction cache line, when your data isn't in GETMAIN storage? You can mitigate by putting constants in between, but how do you know if that's enough?
Re: Avoiding SIIS - (Was Base-less macros)
For the first 2 examples, those are not SIIS violations because no modification of storage takes place. The modification of the instruction happens in the core itself (I once heard it called the "instruction register" where this happens... I can't verify the validity of that statement).

In the 3rd example, it depends on where R5 is pointing as to whether or not that's a SIIS violation. If where R5 points to contains instructions "close by", then yes. If it's a data-only area, then it's not a SIIS violation.

To be a SIIS violation, you have to modify storage in the same CPU cache line where instructions reside. If the 3rd example didn't use MF= and modified the parmlist generated inline by the macro, then yes, that would be a SIIS violation as you're modifying storage that's intermixed with instructions.

==
Adam Johanson
R Software Engineer
adam.johan...@broadcom.com
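The rule of thumb above, sketched as two fragments. The first stores into a word assembled between instructions, so the store hits a line that is also being fetched as code. The second does the same update in separate working storage mapped by a DSECT. The names COUNTER, SKIP, WORK and WCOUNT are invented for the illustration, and R10 is assumed to point at storage obtained elsewhere (for example via GETMAIN).

```hlasm
* Likely SIIS: the updated word sits in the instruction stream
         B     SKIP               branch around inline data
COUNTER  DC    F'0'
SKIP     L     1,COUNTER
         AHI   1,1
         ST    1,COUNTER          store into a line holding code
*
* No SIIS expected: the updated word lives in separate storage
         USING WORK,10            R10 -> work area (assumed)
         L     1,WCOUNT
         AHI   1,1
         ST    1,WCOUNT           store goes to a data-only line
*        ...
WORK     DSECT ,
WCOUNT   DS    F
```

This is the same distinction as in Example 3: MF=(G,(R5),...) copies the inline constants to the area R5 points at and updates them there, rather than storing into the constants that sit behind the BAL.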
Avoiding SIIS - (Was Base-less macros)
I'm reluctant to admit this, but I'm still unclear about SIIS issues. Could someone please explain what happens to the D- and I-cache in these situations?

Example 1:
TxtTRNull TR    0(*-*,R14),NoNulls
          EX    R5,*-6

Example 2:
PgmConst  LOCTR ,
TxtTRNull TR    0(*-*,R14),NoNulls
PgmCode   LOCTR ,
          EX    R5,TxtTRNull

Example 3:
         GENCB BLK=ACB,AM=VSAM,MACRF=(KEY,DIR,SEQ,IN),
               LOC=ANY,RMODE31=ALL,
               MF=(G,(R5),GENCBLN)
+GENCBLN EQU   56                 LENGTH OF PARM LIST AREA USED
+        CNOP  0,4
+        BAL   15,*+44            BRANCH OVER CONSTANTS
+        DC    AL1(160)           BLOCK TYPE CODE
+        DC    AL1(1)             FUNCTION TYPE CODE
   (16 "DC" lines removed)
+        DC    AL2(0)             RESERVED @
+        DC    B'1000'
+        LR    1,R5               POINT TO PARAMETER LIST AREA
+        MVC   16(40,1),0(15)     MOVE ACES TO AREA

(Please forgive the formatting - it's tough to line things up in a proportional font.)

Thanks,
Wendell