Gary,

The first thought that comes to mind is: why not sort the table and use a binary search?
If in fact the table is being updated and sorting is not practical, then your options are a serial search (which you have ruled out) or a hash.

As you may have guessed, SRST and SRSTU are almost certainly millicoded instructions and probably not very fast. Second, hash performance is highly dependent on the number of synonyms you have; given how many entries your table holds, synonyms are a distinct possibility, and a fullword key makes for a large number of possible values. The best hash tables are sized to a Mersenne prime, and right now you are not using one. Lastly, obtain storage for your table in a 1 MB page frame to minimize TLB misses.

Summary:

1 - If the table is static, sort it and use a binary search.
2 - Otherwise, create a hash table sized to a Mersenne prime, large enough to avoid most synonyms.

Tom Harper
Phoenix Software International

Sent from my iPhone

> On Feb 22, 2022, at 5:09 PM, Gary Weinhold <weinh...@dkl.com> wrote:
>
> We are trying to optimize a search routine for keys in fixed-length rows in an unordered array. As the number of rows in the array grows, a serial search becomes relatively inefficient, so we looked for another technique. We tried a SEARCH STRING (SRST) against a one-byte hash of the key to see if it could give us better performance. The relative position of the matching byte in the SRST array was used to determine the location of the key in the original array; if the keys match, the row is found; if they don't match, we redrive the SRST. At about 50 rows, SRST is more efficient than a serial search, so it justifies maintaining the hash array.
>
> On average, we assume the SRST would have to be redriven about (n/256)/2 times, where n is the number of rows in the array. This would not be a big factor for several thousand rows, but as the number of rows went into the tens of thousands, we tried SEARCH STRING UNICODE (SRSTU). It appears to be identical to SRST, except it compares 2-byte values (at 2-byte boundaries).
> So we created a 2-byte hash and, using the same technique based on relative position, tested for performance improvements compared to SRST when the number of rows exceeded 10000. We thought that the reduction in the number of redrives due to non-matching keys (on average, (n/65536)/2) would more than offset the hash array doubling in size.
>
> Our preliminary results show SRSTU taking about 50-60% more time for 15000 and 25000 rows. That came as a surprise to us. We will do more testing.
>
> Is there a possibility we are encountering a hardware vs. microcode implementation of the instructions? Has anyone else tested the performance of these instructions?
>
> Regards, Gary
>
> Gary Weinhold
> Senior Application Architect
> DATAKINETICS | Data Performance & Optimization
> Phone: +1.613.523.5500 x216
> Email: weinh...@dkl.com
> Visit us online at www.DKL.com