Re: Two string instruction questions
Very late to the game, but could you use one of the new vector string instructions? Possibly a combination of VECTOR FIND ELEMENT EQUAL and VECTOR STRING RANGE COMPARE. I've done nothing with this (yet), so I can't provide any answers or advice. You would have to be z13 or better to open. On 2018-03-14 08:51, Charles Mills wrote: 1. Is there a machine instruction that will find one string within another? That given "Now is the time" and "is" would find the "is" and return a pointer to it? A machine instruction analog of Rexx POS? 2. Searching the PoOp for such an instruction led me to CUSE. It does not seem that CUSE could be used for this - is that correct? If I am reading CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would return the position of "is". Is my reading correct? What would that be good for? What would be a reasonable real-world use? Thanks, Charles -- M. Ray Mullins Roseville, CA, USA http://www.catherdersoftware.com/ http://www.z390.org/ German is essentially a form of assembly language consisting entirely of far calls heavily accented with throaty guttural sounds. ---ilvi French is essentially German with messed-up pronunciation and spelling. --Robert B Wilson English is essentially French converted to 7-bit ASCII. ---Christophe Pierret [for Alain LaBonté]
Re: Two string instruction questions
Lol, that is what I typed. Your email joined the lines. Find "foo [new line] bar" Charles Sent from a mobile; please excuse the brevity. Original message From: Paul Gilmartin <0014e0e4a59b-dmarc-requ...@listserv.uga.edu> Date: 3/15/18 10:18 AM (GMT-08:00) To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions On 2018-03-15, at 11:02:17, Charles Mills wrote: > Your points are good but FWIW ; is a command separator and at a "higher > level" than quoted string parsing. > Find "foo;bar" is for better or worse exactly the same as > Find "foobar" > I believe not. Find "foo;bar" is the same as Find "foo bar" ... both of which are reported as syntax errors. Very bad design. A "higher level" executive has no business performing a bottom-up parse of commands it passes to a lower level processor. At least it should support an escape convention, such as: Find "foo\;bar" ... with the "\" protecting the ";". The command passed to the lower level would then simply be: Find "foo;bar" But that couldn't be done with a simple TRT. Oh, my gosh! Think of the performance implications of doing it right! Feedback from a parser to its lexical analyzer is generally deemed harmful (but consider PL/I!) OTOH, the lexical analyzer should not impose limitations on the parser's syntax. > printf( "foo;bar\n" ); /* in C */ > C's lexical analyzer knows enough to recognize that the second ";", but not the first, is a token separator. -- gil
Re: Two string instruction questions
On 2018-03-15, at 11:02:17, Charles Mills wrote: > Your points are good but FWIW ; is a command separator and at a "higher > level" than quoted string parsing. > Find "foo;bar" is for better or worse exactly the same as > Find "foobar" > I believe not. Find "foo;bar" is the same as Find "foo bar" ... both of which are reported as syntax errors. Very bad design. A "higher level" executive has no business performing a bottom-up parse of commands it passes to a lower level processor. At least it should support an escape convention, such as: Find "foo\;bar" ... with the "\" protecting the ";". The command passed to the lower level would then simply be: Find "foo;bar" But that couldn't be done with a simple TRT. Oh, my gosh! Think of the performance implications of doing it right! Feedback from a parser to its lexical analyzer is generally deemed harmful (but consider PL/I!) OTOH, the lexical analyzer should not impose limitations on the parser's syntax. > printf( "foo;bar\n" ); /* in C */ > C's lexical analyzer knows enough to recognize that the second ";", but not the first, is a token separator. -- gil
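The escape convention proposed above can be sketched in a few lines. This is only a Python illustration of the idea, not how ISPF or any real command processor works; the function name `split_commands` and its parameters are made up for the example. It treats ";" as a separator unless it sits inside double quotes or is protected by "\", and it drops the "\" before passing the command on.

```python
def split_commands(line: str, sep: str = ";", esc: str = "\\") -> list[str]:
    """Split a command line on sep, honoring double quotes and an escape.

    Hypothetical helper for illustration only.
    """
    commands, current = [], []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        # An escaped separator is kept as data; the escape itself is dropped.
        if ch == esc and i + 1 < len(line) and line[i + 1] == sep:
            current.append(sep)
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            # Unquoted, unescaped separator: close the current command.
            commands.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    commands.append("".join(current).strip())
    return commands
```

With this sketch, `split_commands('Find "foo;bar"; Find x')` yields two commands and the quoted ";" survives. The point is that a single left-to-right pass with one bit of state is enough; it just cannot be done by a context-free scan for ";".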
Re: Two string instruction questions
On 2018-03-15, at 01:21:23, robi...@dodo.com.au wrote: > Use a TR as well. Works wonders. > Maybe > - Original Message - > Sent:Wed, 14 Mar 2018 21:06:36 -0600 > > Ok. Now make it case-insensitive, which is a common convention. That > can be done with Boyer-Moore. > When Boyer-Moore was first announced (in CACM?) about 40 years ago, it fascinated me. I coded an enhanced variant such that majuscule characters in the pattern would match only majuscule characters in the subject; minuscule characters in the pattern would match either case. I thought it might be useful that "Robin" would match either "Robin" or "ROBIN", but not "robin". Don't know how useful it was, but it was fun to code. I don't believe I could have done it with "a TR as well". -- gil
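The matching rule described above (an upper-case pattern letter matches only upper case, a lower-case pattern letter matches either case) can be sketched as follows. This is not gil's code, which is not shown in the thread. It is a Python sketch that uses the simpler Horspool variant of Boyer-Moore and assumes ASCII letters rather than EBCDIC; the names `matches` and `find_mixed_case` are invented for the example.

```python
def matches(pat_byte: int, text_byte: int) -> bool:
    """True if a pattern byte accepts a text byte under the mixed-case rule."""
    if pat_byte == text_byte:
        return True
    # A lower-case ASCII pattern letter also accepts its upper-case form.
    return 0x61 <= pat_byte <= 0x7A and text_byte == pat_byte - 0x20


def find_mixed_case(text: bytes, pattern: bytes) -> int:
    """Return the 0-based offset of the first match, or -1 (Horspool-style)."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # shift[c]: how far the window may move when text byte c is aligned with
    # the last pattern position. Built from every byte each pattern position
    # (except the last) would accept, so no possible match is skipped.
    shift = [m] * 256
    for j in range(m - 1):
        for c in range(256):
            if matches(pattern[j], c):
                shift[c] = m - 1 - j
    i = 0
    while i + m <= n:
        if all(matches(pattern[k], text[i + k]) for k in range(m)):
            return i
        i += shift[text[i + m - 1]]
    return -1
```

For example, the pattern `b"Robin"` is found in `b"hello ROBIN"` and `b"say Robin"` but not in `b"hello robin"`. The only change from a plain Horspool search is that the shift table is filled from the set of text bytes each pattern position accepts, not from the pattern bytes alone.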
Re: Two string instruction questions
Your points are good but FWIW ; is a command separator and at a "higher level" than quoted string parsing. Find "foo;bar" is for better or worse exactly the same as Find "foobar" Charles Sent from a mobile; please excuse the brevity. Original message From: Paul Gilmartin <0014e0e4a59b-dmarc-requ...@listserv.uga.edu> Date: 3/15/18 9:45 AM (GMT-08:00) To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions On 2018-03-15, at 08:27:57, Charles Mills wrote: > > 2. TRT is a single op code but that does not make it "fast." > The peculiar evil of TRT is that "Everything looks like a nail." You need only hammer on it enough with TRT. In ISPF, the command: FIND "foo;bar" ... fails for "Unterminated delimited string". But: SAY "foo;bar"; /* in Rexx */ echo "foo;bar"; # in POSIX shell printf( "foo;bar\n" ); /* in C */ In all 4 languages ";" is a command separator. Only ISPF fails to understand that in a quoted string it should not have that special meaning. I suspect that ISPF does a misguided bottom-up search for ";" with TRT and deems it an unconditional command separator. (Yes, I know that I can choose to sacrifice some other character as a command separator, and avoid that in strings. At times I've used "¾". But why should such mickeymouse be necessary?) (Yes, I know that I can specify my search target in hex. Ugh! It would be better if I could use sporadic hex escapes in otherwise plain text.) -- gil
Re: Two string instruction questions
On 2018-03-15, at 08:27:57, Charles Mills wrote: > > 2. TRT is a single op code but that does not make it "fast." > The peculiar evil of TRT is that "Everything looks like a nail." You need only hammer on it enough with TRT. In ISPF, the command: FIND "foo;bar" ... fails for "Unterminated delimited string". But: SAY "foo;bar"; /* in Rexx */ echo "foo;bar"; # in POSIX shell printf( "foo;bar\n" ); /* in C */ In all 4 languages ";" is a command separator. Only ISPF fails to understand that in a quoted string it should not have that special meaning. I suspect that ISPF does a misguided bottom-up search for ";" with TRT and deems it an unconditional command separator. (Yes, I know that I can choose to sacrifice some other character as a command separator, and avoid that in strings. At times I've used "¾". But why should such mickeymouse be necessary?) (Yes, I know that I can specify my search target in hex. Ugh! It would be better if I could use sporadic hex escapes in otherwise plain text.) -- gil
Re: Two string instruction questions
Yeah - mea culpa. It's been years since I looked at that code. We escaped our search strings and then looked for the opening character. You might still do better to do an SRST / CLC loop. On 3/15/18, 11:04 AM, "IBM Mainframe Assembler List on behalf of Charles Mills" <ASSEMBLER-LIST@LISTSERV.UGA.EDU on behalf of charl...@mcn.org> wrote: Nothing, so far as I know. :-) But it finds only a single character and not a substring in a string; a needle in a haystack, to use the Rexx idiom. For finding a single character in a string, SRST is almost undoubtedly faster than TRT. For finding either of two characters, hard to say, but I am going to guess SRST is still faster. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Alan Atkinson Sent: Thursday, March 15, 2018 7:32 AM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions Am I missing something? What's wrong with SRST? On 3/15/18, 10:28 AM, "IBM Mainframe Assembler List on behalf of Charles Mills" <ASSEMBLER-LIST@LISTSERV.UGA.EDU on behalf of charl...@mcn.org> wrote: Traditional, but not anywhere near optimally quick. 1. Read the Wikipedia article @Gil linked to. There are faster ways than the obvious way. 2. TRT is a single op code but that does not make it "fast." Picture an assembler loop to do what TRT does. Not very fast, right? Now picture the same loop in millicode. Still not very fast. That's TRT. Millicode is not magic; it's more or less machine code running under the covers. SRST is probably considerably faster (although still not magically fast, and not case-insensitive as TRT potentially is). On the other hand, "it depends"; and cache misses are what is slow -- instructions themselves hardly matter. 
Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Robin Vowels Sent: Wednesday, March 14, 2018 7:32 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions From: "Charles Mills" <charl...@mcn.org> Sent: Thursday, March 15, 2018 2:51 AM > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? > > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am > reading CUSE correctly, then given "Now is the time", "All is well" > and 2 or 3 would return the position of "is". Is my reading correct? > What would that be good for? What would be a reasonable real-world use? The traditional TRT to search for the first letter, followed by CLC for the word (for which the search can commence at the second letter, because the first letter has already been found), will likely be the quickest.
Re: Two string instruction questions
Nothing, so far as I know. :-) But it finds only a single character and not a substring in a string; a needle in a haystack, to use the Rexx idiom. For finding a single character in a string, SRST is almost undoubtedly faster than TRT. For finding either of two characters, hard to say, but I am going to guess SRST is still faster. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Alan Atkinson Sent: Thursday, March 15, 2018 7:32 AM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions Am I missing something? What's wrong with SRST? On 3/15/18, 10:28 AM, "IBM Mainframe Assembler List on behalf of Charles Mills" <ASSEMBLER-LIST@LISTSERV.UGA.EDU on behalf of charl...@mcn.org> wrote: Traditional, but not anywhere near optimally quick. 1. Read the Wikipedia article @Gil linked to. There are faster ways than the obvious way. 2. TRT is a single op code but that does not make it "fast." Picture an assembler loop to do what TRT does. Not very fast, right? Now picture the same loop in millicode. Still not very fast. That's TRT. Millicode is not magic; it's more or less machine code running under the covers. SRST is probably considerably faster (although still not magically fast, and not case-insensitive as TRT potentially is). On the other hand, "it depends"; and cache misses are what is slow -- instructions themselves hardly matter. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Robin Vowels Sent: Wednesday, March 14, 2018 7:32 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions From: "Charles Mills" <charl...@mcn.org> Sent: Thursday, March 15, 2018 2:51 AM > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? > > 2. 
Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am > reading CUSE correctly, then given "Now is the time", "All is well" > and 2 or 3 would return the position of "is". Is my reading correct? > What would that be good for? What would be a reasonable real-world use? The traditional TRT to search for the first letter, followed by CLC for the word (for which the search can commence at the second letter, because the first letter has already been found), will likely be the quickest.
Re: Two string instruction questions
Am I missing something? What's wrong with SRST? On 3/15/18, 10:28 AM, "IBM Mainframe Assembler List on behalf of Charles Mills" <ASSEMBLER-LIST@LISTSERV.UGA.EDU on behalf of charl...@mcn.org> wrote: Traditional, but not anywhere near optimally quick. 1. Read the Wikipedia article @Gil linked to. There are faster ways than the obvious way. 2. TRT is a single op code but that does not make it "fast." Picture an assembler loop to do what TRT does. Not very fast, right? Now picture the same loop in millicode. Still not very fast. That's TRT. Millicode is not magic; it's more or less machine code running under the covers. SRST is probably considerably faster (although still not magically fast, and not case-insensitive as TRT potentially is). On the other hand, "it depends"; and cache misses are what is slow -- instructions themselves hardly matter. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Robin Vowels Sent: Wednesday, March 14, 2018 7:32 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions From: "Charles Mills" <charl...@mcn.org> Sent: Thursday, March 15, 2018 2:51 AM > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? > > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am > reading CUSE correctly, then given "Now is the time", "All is well" > and 2 or 3 would return the position of "is". Is my reading correct? > What would that be good for? What would be a reasonable real-world use? The traditional TRT to search for the first letter, followed by CLC for the word (for which the search can commence at the second letter, because the first letter has already been found), will likely be the quickest. 
Re: Two string instruction questions
Works wonders for slowing things down! TR modifies storage, one of the most expensive acts of all. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of robi...@dodo.com.au Sent: Thursday, March 15, 2018 12:21 AM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions Use a TR as well. Works wonders.
Re: Two string instruction questions
Traditional, but not anywhere near optimally quick. 1. Read the Wikipedia article @Gil linked to. There are faster ways than the obvious way. 2. TRT is a single op code but that does not make it "fast." Picture an assembler loop to do what TRT does. Not very fast, right? Now picture the same loop in millicode. Still not very fast. That's TRT. Millicode is not magic; it's more or less machine code running under the covers. SRST is probably considerably faster (although still not magically fast, and not case-insensitive as TRT potentially is). On the other hand, "it depends"; and cache misses are what is slow -- instructions themselves hardly matter. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Robin Vowels Sent: Wednesday, March 14, 2018 7:32 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions From: "Charles Mills" <charl...@mcn.org> Sent: Thursday, March 15, 2018 2:51 AM > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? > > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am > reading CUSE correctly, then given "Now is the time", "All is well" > and 2 or 3 would return the position of "is". Is my reading correct? > What would that be good for? What would be a reasonable real-world use? The traditional TRT to search for the first letter, followed by CLC for the word (for which the search can commence at the second letter, because the first letter has already been found), will likely be the quickest.
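To make "picture an assembler loop to do what TRT does" concrete, here is a small Python model of that loop. It is a sketch of the scanning logic only: the names `trt_model` and `make_table` are invented, condition codes and register results are left out, and the bytes are ASCII rather than EBCDIC. It also shows why TRT can be case-insensitive where a single-character search is not: the table simply marks both cases.

```python
def make_table(stop_bytes: bytes) -> bytes:
    """Build a 256-byte table with a nonzero function byte for each stop byte."""
    table = bytearray(256)
    for b in stop_bytes:
        table[b] = 1
    return bytes(table)


def trt_model(data: bytes, table: bytes):
    """Scan left to right; stop at the first byte whose table entry is nonzero.

    Returns (index, function_byte), or None if every entry was zero.
    """
    assert len(table) == 256
    for index, byte in enumerate(data):
        function_byte = table[byte]  # one table lookup per argument byte
        if function_byte != 0:
            return index, function_byte
    return None
```

`trt_model(b"NOW IS THE TIME", make_table(b"iI"))` stops at offset 4 just as it does for the lower-case text. Every byte costs a table fetch, which is the reason given above for not expecting TRT to be especially fast.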
Re: Two string instruction questions
Use a TR as well. Works wonders. - Original Message - From: "IBM Mainframe Assembler List" Sent: Wed, 14 Mar 2018 21:06:36 -0600 Subject: Re: Two string instruction questions On 2018-03-14, at 20:32:18, Robin Vowels wrote: > From: "Charles Mills" > Sent: Thursday, March 15, 2018 2:51 AM > >> 1. Is there a machine instruction that will find one string within >> another? That given "Now is the time" and "is" would find the "is" and >> return a pointer to it? A machine instruction analog of Rexx POS? >> 2. Searching the PoOp for such an instruction led me to CUSE. It does >> not seem that CUSE could be used for this - is that correct? If I am reading >> CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would >> return the position of "is". Is my reading correct? What would that be good >> for? What would be a reasonable real-world use? > > The traditional TRT to search for the first letter, > followed by CLC for the word (for which the search can commence > at the second letter, because the first letter has already been found), > will likely be the quickest. > Ok. Now make it case-insensitive, which is a common convention. That can be done with Boyer-Moore. -- gil
Re: Two string instruction questions
On 2018-03-14, at 20:32:18, Robin Vowels wrote: > From: "Charles Mills" > Sent: Thursday, March 15, 2018 2:51 AM > >> 1. Is there a machine instruction that will find one string within >> another? That given "Now is the time" and "is" would find the "is" and >> return a pointer to it? A machine instruction analog of Rexx POS? >> 2. Searching the PoOp for such an instruction led me to CUSE. It does >> not seem that CUSE could be used for this - is that correct? If I am reading >> CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would >> return the position of "is". Is my reading correct? What would that be good >> for? What would be a reasonable real-world use? > > The traditional TRT to search for the first letter, > followed by CLC for the word (for which the search can commence > at the second letter, because the first letter has already been found), > will likely be the quickest. > Ok. Now make it case-insensitive, which is a common convention. That can be done with Boyer-Moore. -- gil
Re: Two string instruction questions
Thanks @Jonathan. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Jonathan Scott Sent: Wednesday, March 14, 2018 9:56 AM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions Ref: Your note of 14 March 2018, 08:51:22 -0700 Charles Mills wrote: > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? I'm not aware of any such instruction. You can use SRST to find a single character, and that can be used as part of a search algorithm. > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am > reading CUSE correctly, then given "Now is the time", "All is well" > and 2 or 3 would return the position of "is". Is my reading correct? > What would that be good for? What would be a reasonable real-world use? CUSE is very useful for creating an efficient representation of the differences between two versions of a record, for example to create a log record showing which bytes changed. It finds the end of a block of changed data, ignoring unchanged sequences which are too small to be worth representing as a separate section. To skip to the end of the unchanged data, CLCL or CLCLE can be used. Jonathan Scott HLASM, IBM Hursley, UK
Re: Two string instruction questions
Hi Peter, Partially right. In a VSAM file (KSDS only, I think) you can request the index to have compressed keys. It was only worthwhile if the keys were large (max length is 127 bytes). The index entry would be compared to the previous one: both the beginning and ending key bytes would be compared, and whatever was the same got a length instead of the bytes. The final index entry looks horrible, but it saved valuable disk space back in the time when dinosaurs (like me) ruled the Earth :-) The RBA would not have been there for a KSDS. Have I got the very last "VSAM Logic" manual on the planet? Melvyn Maltz - Original Message - From: "Farley, Peter x23353" <peter.far...@broadridge.com> To: <ASSEMBLER-LIST@LISTSERV.UGA.EDU> Sent: Wednesday, March 14, 2018 10:18 PM Subject: Re: Two string instruction questions I think I read somewhere that is what keyed VSAM Index records are, aren't they? A count of equal key bytes and then the remaining non-equal bytes, followed by the RBA in the data component? Or is that a fib I was told? Peter -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Charles Mills Sent: Wednesday, March 14, 2018 5:49 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions That's interesting. Thanks! I did think of what CUSE would be perfect for: what I know as "vertical compression" but Google does not seem to know the term. Think of standard run length compression as "horizontal." Picture something like that, but where a code indicates "the next 'n' bytes of this record are identical to the bytes in the same position in the previous record." Only works for sequential files, because you need the previous record to decode this record. But works well where there are a lot of repeating fields.
Re: Two string instruction questions
I think I read somewhere that is what keyed VSAM Index records are, aren't they? A count of equal key bytes and then the remaining non-equal bytes, followed by the RBA in the data component? Or is that a fib I was told? Peter -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Charles Mills Sent: Wednesday, March 14, 2018 5:49 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions That's interesting. Thanks! I did think of what CUSE would be perfect for: what I know as "vertical compression" but Google does not seem to know the term. Think of standard run length compression as "horizontal." Picture something like that, but where a code indicates "the next 'n' bytes of this record are identical to the bytes in the same position in the previous record." Only works for sequential files, because you need the previous record to decode this record. But works well where there are a lot of repeating fields.
Re: Two string instruction questions
That's interesting. Thanks! I did think of what CUSE would be perfect for: what I know as "vertical compression" but Google does not seem to know the term. Think of standard run length compression as "horizontal." Picture something like that, but where a code indicates "the next 'n' bytes of this record are identical to the bytes in the same position in the previous record." Only works for sequential files, because you need the previous record to decode this record. But works well where there are a lot of repeating fields. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Paul Gilmartin Sent: Wednesday, March 14, 2018 10:56 AM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions On 2018-03-14, at 09:51:22, Charles Mills wrote: > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? > > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am > reading CUSE correctly, then given "Now is the time", "All is well" > and 2 or 3 would return the position of "is". Is my reading correct? > What would that be good for? What would be a reasonable real-world use? > BTW, the classic technique is Boyer-Moore. But see: https://en.wikipedia.org/wiki/String_searching_algorithm This might easily be done in microcode. And I once coded a case-insensitive variant of Boyer-Moore. -- gil
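The "vertical compression" idea described above can be sketched like this. It is a Python illustration under assumptions of my own: the token format (`("same", n)` and `("lit", bytes)`), the `min_run` threshold, and the function names are all invented, and a real implementation would pack the tokens into bytes.

```python
def encode(prev: bytes, cur: bytes, min_run: int = 3):
    """Encode cur relative to prev as ("same", n) and ("lit", bytes) tokens."""
    tokens = []
    literal = bytearray()
    limit = min(len(prev), len(cur))
    i = 0
    while i < len(cur):
        # Measure how many bytes match the previous record at this position.
        run = 0
        while i + run < limit and prev[i + run] == cur[i + run]:
            run += 1
        if run >= min_run:
            if literal:
                tokens.append(("lit", bytes(literal)))
                literal = bytearray()
            tokens.append(("same", run))
            i += run
        else:
            # Short matches are not worth a token; keep the byte as a literal.
            literal.append(cur[i])
            i += 1
    if literal:
        tokens.append(("lit", bytes(literal)))
    return tokens


def decode(prev: bytes, tokens) -> bytes:
    """Rebuild the record; "same" copies from the same position in prev."""
    out = bytearray()
    for kind, value in tokens:
        if kind == "same":
            out += prev[len(out):len(out) + value]
        else:
            out += value
    return bytes(out)
```

For the records `b"SMITH   JOHN  NY"` and `b"SMITH   JANE  NY"` the second one encodes as "same 9, literal ANE, same 4". As noted above, decoding needs the previous record, which is why the scheme suits sequential files.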
Re: Two string instruction questions
On 2018-03-14, at 09:51:22, Charles Mills wrote: > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? > > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am reading > CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would > return the position of "is". Is my reading correct? What would that be good > for? What would be a reasonable real-world use? > BTW, the classic technique is Boyer-Moore. But see: https://en.wikipedia.org/wiki/String_searching_algorithm This might easily be done in microcode. And I once coded a case-insensitive variant of Boyer-Moore. -- gil
Re: Two string instruction questions
Ref: Your note of 14 March 2018, 08:51:22 -0700 Charles Mills wrote: > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? I'm not aware of any such instruction. You can use SRST to find a single character, and that can be used as part of a search algorithm. > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am reading > CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would > return the position of "is". Is my reading correct? What would that be good > for? What would be a reasonable real-world use? CUSE is very useful for creating an efficient representation of the differences between two versions of a record, for example to create a log record showing which bytes changed. It finds the end of a block of changed data, ignoring unchanged sequences which are too small to be worth representing as a separate section. To skip to the end of the unchanged data, CLCL or CLCLE can be used. Jonathan Scott HLASM, IBM Hursley, UK
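The change-log use described above can be modelled roughly as follows. This is a Python sketch, not assembler and not anyone's production code: `changed_blocks` and `min_equal` are invented names, and the records are assumed to be the same length. The inner loop plays the part the post assigns to CUSE (find the next run of at least `min_equal` equal bytes at the same offset), and the skip loop plays the part assigned to CLCL or CLCLE.

```python
def changed_blocks(old: bytes, new: bytes, min_equal: int = 4):
    """Return [(offset, new_bytes), ...] describing how new differs from old.

    Unchanged runs shorter than min_equal are absorbed into a changed block.
    """
    if len(old) != len(new):
        raise ValueError("this sketch assumes equal-length records")
    blocks = []
    n = len(new)
    i = 0
    while i < n:
        # Skip unchanged data.
        while i < n and old[i] == new[i]:
            i += 1
        if i == n:
            break
        start = i
        # Advance until min_equal consecutive equal bytes are seen (or the end).
        equal_run = 0
        while i < n and equal_run < min_equal:
            equal_run = equal_run + 1 if old[i] == new[i] else 0
            i += 1
        end = i - equal_run  # the block stops at its last changed byte
        blocks.append((start, new[start:end]))
    return blocks
```

With `old = b"AAAAAAAAAAAAAAAA"` and `new = b"AAxAyAAAAAAAzAAA"`, the one-byte unchanged gap between "x" and "y" is folded into a single block at offset 2, while the long unchanged stretch before "z" starts a new block at offset 12.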
Re: Two string instruction questions
On 14 March 2018 at 11:51, Charles Mills wrote: > 1. Is there a machine instruction that will find one string within > another? That given "Now is the time" and "is" would find the "is" and > return a pointer to it? A machine instruction analog of Rexx POS? I am almost certain that there is not. > 2. Searching the PoOp for such an instruction led me to CUSE. It does > not seem that CUSE could be used for this - is that correct? If I am reading > CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would > return the position of "is". Is my reading correct? What would that be good > for? What would be a reasonable real-world use? The usefulness or otherwise of CUSE has been discussed here a couple of times over the years. IIRC DB2 (sorry, Db2) rows were mentioned. Certainly CUSE by itself is not a Rexx-style POS. I found the description in the POP unclear, and so wrote and tested some little programs to see what it does. It finds a substring match *at the same offset* in both strings, so it cannot do what is needed for POS. To implement POS I ended up using SRST to find the first character, followed by CLCL for the rest, and it turns out that the register setup is not too bad for this use if you choose carefully. I couldn't see a useful way to exploit CUSE after SRST for this, but it's quite possible I missed the trick. And it's certainly possible that an old-school approach using CLC in a loop might be faster than my two-step process. To do POS my way you have to remember a couple of (obvious but easy to forget) things: - if your SRST finds the first character but the CLCL doesn't find the whole string, then you have to loop back and try for the first character again. For example if you searched a string "Now is the time it seems" for "it". - you have to keep track of the length remaining in the string you are searching in when you do the CLCL. 
Unlike SRST, which uses an end-pointer that should stay valid, CLCL needs a current length for each operand. Somehow I wanted CLCL's padding to be useful here, but it isn't. And a more subtle one: - SRST does not change the pointer to the string (the second register) unless you get CC3. For this instruction CC3 can occur only if the string is > 256 bytes. So if you know in advance that this will be the case, you may be able to avoid having to fix up that register if you are going to need it to calculate a length for the CLCL. OTOH you may be setting up a code maintainer for a nightmare if you assume this and conditions change later. It's also almost impossible to keep track in one's head of which of the "CPU determined number of bytes" instructions guarantee at least how many bytes. Some are 256, some are 8, and many are 1. I'm not a Compiler Guy, but I am fascinated that this kind of subtlety can be encoded in instruction description tables that compiler code generation can use. Tony H.
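The two-step approach and its retry loop can be sketched as follows. This is a Python model of the control flow only, not assembler: `bytes.find` on a single byte stands in for SRST, a slice comparison stands in for CLCL (or CLC), and the function name `pos` and its 0-based result are choices made for the example (Rexx POS itself is 1-based and returns 0 when nothing is found).

```python
def pos(needle: bytes, haystack: bytes) -> int:
    """Return the 0-based offset of needle in haystack, or -1 if absent."""
    if not needle:
        return -1
    start = 0
    # Last offset at which the needle could still fit; this is the
    # "length remaining" bookkeeping mentioned above.
    last_start = len(haystack) - len(needle)
    while start <= last_start:
        # Step 1 (SRST analog): find the next occurrence of the first byte.
        hit = haystack.find(needle[:1], start, last_start + 1)
        if hit < 0:
            return -1
        # Step 2 (CLCL/CLC analog): compare the remaining bytes.
        if haystack[hit + 1:hit + len(needle)] == needle[1:]:
            return hit
        # First byte matched but the rest did not: loop back and search again.
        start = hit + 1
    return -1
```

Searching `b"Now is the time it seems"` for `b"it"` exercises the retry path: the "i" in "is" and the "i" in "time" are both false starts before the match at offset 16.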
Re: Two string instruction questions
I don't recall seeing "at same offset" in the PoOps. I'll double check and, if appropriate, send an RCF. -- Shmuel (Seymour J.) Metz http://mason.gmu.edu/~smetz3 From: IBM Mainframe Assembler List <ASSEMBLER-LIST@listserv.uga.edu> on behalf of Charles Mills <charl...@mcn.org> Sent: Wednesday, March 14, 2018 12:33 PM To: ASSEMBLER-LIST@listserv.uga.edu Subject: Re: Two string instruction questions I don't read it that way but I am less than certain of my interpretation. Some CUSE examples in Appendix A would be nice, eh? I tried Googling for that and got this which seems to support my interpretation of CUSE: http://ibmmainframes.com/about23525.html It also shares my "well then, what IS CUSE good for?" FWIW, both of my strings are anticipated to be < 256 bytes. Charles -Original Message- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Seymour J Metz Sent: Wednesday, March 14, 2018 9:22 AM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: Two string instruction questions If your search string is less than 256 bytes then CUSE should work, if I am reading the PoOps correctly. Set R0 to the length of the search string. -- Shmuel (Seymour J.) Metz http://mason.gmu.edu/~smetz3 From: IBM Mainframe Assembler List <ASSEMBLER-LIST@listserv.uga.edu> on behalf of Charles Mills <charl...@mcn.org> Sent: Wednesday, March 14, 2018 11:51 AM To: ASSEMBLER-LIST@listserv.uga.edu Subject: Two string instruction questions 1. 
Is there a machine instruction that will find one string within another? That given "Now is the time" and "is" would find the "is" and return a pointer to it? A machine instruction analog of Rexx POS? 2. Searching the PoOp for such an instruction led me to CUSE. It does not seem that CUSE could be used for this - is that correct? If I am reading CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would return the position of "is". Is my reading correct? What would that be good for? What would be a reasonable real-world use?
Re: Two string instruction questions
According to John Ehrman's "Assembler Language Programming for IBM System z Servers, Version 2.00," the CUSE instruction searches only for matches at the same offset. In the case you describe, it would not find a match unless the second string was "    is" (with four leading blanks), so that the word you are looking for is at offset 4 in both strings.

> -----Original Message-----
> From: IBM Mainframe Assembler List On Behalf Of Charles Mills
> Sent: Wednesday, March 14, 2018 8:51 AM
> To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
> Subject: Two string instruction questions
>
> 1. Is there a machine instruction that will find one string within another? That given "Now is the time" and "is" would find the "is" and return a pointer to it? A machine instruction analog of Rexx POS?
>
> 2. Searching the PoOp for such an instruction led me to CUSE. It does not seem that CUSE could be used for this - is that correct? If I am reading CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would return the position of "is". Is my reading correct? What would that be good for? What would be a reasonable real-world use?
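To make the "same offset" reading concrete, here is a small Python model of that interpretation of CUSE. It is a sketch under assumptions, not a statement of what the PoOps specifies: `cuse_model` is a made-up name, the condition codes and the CPU-determined early exit are left out, the strings are ASCII rather than EBCDIC, and the shorter operand is simply padded with the pad byte to the length of the longer one.

```python
def cuse_model(op1: bytes, op2: bytes, sub_len: int, pad: int = 0x20):
    """Return the first offset at which op1 and op2 hold the same
    sub_len bytes AT THE SAME OFFSET, or None if there is no such run.

    Assumption: this models only the "same offset" interpretation
    discussed in this thread.
    """
    if sub_len < 1:
        raise ValueError("sub_len must be at least 1 in this model")
    n = max(len(op1), len(op2))
    a = op1.ljust(n, bytes([pad]))   # pad the shorter operand
    b = op2.ljust(n, bytes([pad]))
    run = 0                          # length of the current equal run
    for i in range(n):
        if a[i] == b[i]:
            run += 1
            if run == sub_len:
                return i - sub_len + 1
        else:
            run = 0
    return None


# Both strings hold " is " at offsets 3-6, so a run is found at offset 3.
print(cuse_model(b"Now is the time", b"All is well", 2))  # 3
# "is" sits at offset 0 in the second operand, so nothing lines up.
print(cuse_model(b"Now is the time", b"is", 2))           # None
# Shift "is" to offset 4 in the second operand and it matches.
print(cuse_model(b"Now is the time", b"xxxxis", 2))       # 4
```

Under this reading the instruction compares two operands position by position, which is why it does not behave like Rexx POS.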
Re: Two string instruction questions
I don't read it that way, but I am less than certain of my interpretation. Some CUSE examples in Appendix A would be nice, eh? I tried Googling for that and got this, which seems to support my interpretation of CUSE: http://ibmmainframes.com/about23525.html

It also shares my "well then, what IS CUSE good for?"

FWIW, both of my strings are anticipated to be < 256 bytes.

Charles

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Seymour J Metz
Sent: Wednesday, March 14, 2018 9:22 AM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Two string instruction questions

If your search string is less than 256 bytes then CUSE should work, if I am reading the PoOps correctly. Set R0 to the length of the search string.

-- Shmuel (Seymour J.) Metz http://mason.gmu.edu/~smetz3
Re: Two string instruction questions
It appears that if R0 also equals the length of the second operand, you should get the desired result.

Richard

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Seymour J Metz
Sent: Wednesday, March 14, 2018 12:22 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Two string instruction questions

If your search string is less than 256 bytes then CUSE should work, if I am reading the PoOps correctly. Set R0 to the length of the search string.

-- Shmuel (Seymour J.) Metz http://mason.gmu.edu/~smetz3
Re: Two string instruction questions
If your search string is less than 256 bytes then CUSE should work, if I am reading the PoOps correctly. Set R0 to the length of the search string.

-- Shmuel (Seymour J.) Metz http://mason.gmu.edu/~smetz3

From: IBM Mainframe Assembler List <ASSEMBLER-LIST@listserv.uga.edu> on behalf of Charles Mills <charl...@mcn.org>
Sent: Wednesday, March 14, 2018 11:51 AM
To: ASSEMBLER-LIST@listserv.uga.edu
Subject: Two string instruction questions

1. Is there a machine instruction that will find one string within another? That given "Now is the time" and "is" would find the "is" and return a pointer to it? A machine instruction analog of Rexx POS?

2. Searching the PoOp for such an instruction led me to CUSE. It does not seem that CUSE could be used for this - is that correct? If I am reading CUSE correctly, then given "Now is the time", "All is well" and 2 or 3 would return the position of "is". Is my reading correct? What would that be good for? What would be a reasonable real-world use?

Thanks,
Charles