I was going to say all of these same things ... plus Consider two tables, one of "stable keys" and one of "new keys." Sort and binary search the stable list; do a linear search on the additions.
You can pseudo-delete a key from the stable table by turning on a "deleted" flag. You might also consider a B-Tree that you update dynamically. For either of these possibilities you need to consider frequency of update versus frequency of lookup. Charles -----Original Message----- From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On Behalf Of Tom Harper Sent: Tuesday, February 22, 2022 2:45 PM To: ASSEMBLER-LIST@LISTSERV.UGA.EDU Subject: Re: SRST vs. SRSTU Gary, The first thought that comes to mind is why not sort the table and use a binary search? If in fact the table is being updated and that is not practical, then your options are a serial search (which you have ruled out) or a hash. As you may have guessed, SRST and SRSTU are almost certainly milli-coded instructions and probably not very fast. Secondly, hash performance is highly dependent on the number of synonyms you have. Considering how many entries in your table, this is a distinct possibility. A full word makes for a large number of entries. The best hash tables are of a size of a Mersenne Prime. Right now you are not using one. Lastly, obtain storage for your table in a one Mb page frame to minimize TLB misses. Summary: 1 - If table is static, sort and use a binary search. 2 - Otherwise, create a hash table of a Mersenne Prime of size large enough to avoid most synonyms. Tom Harper Phoenix Software International Sent from my iPhone > On Feb 22, 2022, at 5:09 PM, Gary Weinhold <weinh...@dkl.com> wrote: > > We are trying to optimize a search routine for keys in fixed length rows in an unordered array. As the number of rows in the array grows, a serial search becomes relatively inefficient, so we looked for another technique. We tried a SEARCH STRING (SRST) against a one byte hash of the key to see if it could give us better performance. The relative positive of the matching byte in the SRST array was used to determine the location of the key in the original array; if the keys match, the row is found; if they don't match, we redrive the SRST. At about 50 rows, SRST is more efficient than a serial search so it justifies maintaining the hash array. > On the average, we assume the SRST would have to be redriven about (n/256)/2 times, where n is the number of rows in the array. This would not be a big factor for several thousand rows, but as the number of rows went into the tens of thousands, we tried Search String Unicode (SRSTU). It appears to be identical to SRST, except it compares 2-byte values (at 2-byte boundaries). So we created a 2-byte hash and, using the same technique based on relative position, tested for performance improvements compared to SRST when the number of rows exceeded 10000. We thought that the reduction in the number of redrives due to non-matching keys (on average, (n/65536)/2) would more than offset the hash array doubling in size. > > Our preliminary results show SRSTU about taking about 50-60% more time for 15000 and 25000 rows. That came as a surprise to us. We will do more testing. > > Is there a possibility we are encountering a hardware vs. microcode implementation of the instuctions? Has anyone else tested the performance of these instructions? > > Regards, Gary > > Gary Weinhold > Senior Application Architect > We are trying to optimize a search routine for keys in fixed length rows in an unordered array. As the number of rows in the array grows, a serial search becomes relatively inefficient, so we looked for another technique. We tried a SEARCH STRING (SRST) against a one byte hash of the key to see if it could give us better performance. The relative positive of the matching byte in the SRST array was used to determine the location of the key in the original array; if the keys match, the row is found; if they don't match, we redrive the SRST. At about 50 rows, SRST is more efficient than a serial search so it justifies maintaining the hash array. > On the average, we assume the SRST would have to be redriven about (n/256)/2 times, where n is the number of rows in the array. This would not be a big factor for several thousand rows, but as the number of rows went into the tens of thousands, we tried Search String Unicode (SRSTU). It appears to be identical to SRST, except it compares 2-byte values (at 2-byte boundaries). So we created a 2-byte hash and, using the same technique based on relative position, tested for performance improvements compared to SRST when the number of rows exceeded 10000. We thought that the reduction in the number of redrives due to non-matching keys (on average, (n/65536)/2) would more than offset the hash array doubling in size. > > Our preliminary results show SRSTU about taking about 50-60% more time for 15000 and 25000 rows. That came as a surprise to us. We will do more testing. > > Is there a possibility we are encountering a hardware vs. microcode implementation of the instuctions? Has anyone else tested the performance of these instructions? > > Regards, Gary > > Gary Weinhold > Senior Application Architect > DATAKINETICS | Data Performance & Optimization > Phone:+1.613.523.5500 x216 > Email: weinh...@dkl.com > Visit us online at www.DKL.com > E-mail Notification: The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system. ---------------------------------------------------------------------------- ---- This e-mail message, including any attachments, appended messages and the information contained therein, is for the sole use of the intended recipient(s). If you are not an intended recipient or have otherwise received this email message in error, any use, dissemination, distribution, review, storage or copying of this e-mail message and the information contained therein is strictly prohibited. If you are not an intended recipient, please contact the sender by reply e-mail and destroy all copies of this email message and do not otherwise utilize or retain this email message or any or all of the information contained therein. Although this email message and any attachments or appended messages are believed to be free of any virus or other defect that might affect any computer system into which it is received and opened, it is the responsibility of the recipient to ensure that it is virus free and no responsibility is accepted by the sender for any loss or damage arising in any way from its opening or use.