Gary,

The first thought that comes to mind is why not sort the table and use a binary 
search?

If in fact the table is being updated and sorting is not practical, then your 
options are a serial search (which you have ruled out) or a hash. 

As you may have guessed, SRST and SRSTU are almost certainly milli-coded 
instructions and probably not very fast. 

Secondly, hash performance is highly dependent on the number of synonyms you 
have, and given how many entries your table holds, synonyms are a distinct 
possibility. A fullword hash makes for a large number of table entries. The 
best hash tables are sized to a Mersenne prime, and right now you are not 
using one. 

Lastly, obtain the storage for your table in 1 MB page frames to minimize TLB 
misses. 

Summary:

1 - If the table is static, sort it and use a binary search.

2 - Otherwise, create a hash table whose size is a Mersenne prime large 
enough to avoid most synonyms. 

Tom Harper

Phoenix Software International 

Sent from my iPhone

> On Feb 22, 2022, at 5:09 PM, Gary Weinhold <weinh...@dkl.com> wrote:
> 
> We are trying to optimize a search routine for keys in fixed length rows in 
> an unordered array.  As the number of rows in the array grows, a serial 
> search becomes relatively inefficient, so we looked for another technique.  
> We tried a SEARCH STRING (SRST) against a one byte hash of the key to see if 
> it could give us better performance.  The relative position of the matching 
> byte in the SRST array was used to determine the location of the key in the 
> original array; if the keys match, the row is found; if they don't match, we 
> redrive the SRST.  At about 50 rows, SRST is more efficient than a serial 
> search so it justifies maintaining the hash array.
>  On average, we assume the SRST would have to be redriven about (n/256)/2 
> times, where n is the number of rows in the array.  This would not be a big 
> factor for several thousand rows, but as the number of rows went into the 
> tens of thousands, we tried Search String Unicode (SRSTU).  It appears to be 
> identical to SRST, except it compares 2-byte values (at 2-byte boundaries). 
> So we created a 2-byte hash and, using the same technique based on relative 
> position, tested for performance improvements compared to SRST when the 
> number of rows exceeded 10000.  We thought that the reduction in the number 
> of redrives due to non-matching keys (on average,  (n/65536)/2) would more 
> than offset the hash array doubling in size.
> 
> Our preliminary results show SRSTU taking about 50-60% more time for 
> 15000 and 25000 rows.  That came as a surprise to us.  We will do more 
> testing.
> 
> Is there a possibility we are encountering a hardware vs. microcode 
> implementation of the instructions?  Has anyone else tested the performance of 
> these instructions?
> 
> Regards, Gary
> 
> Gary Weinhold
> Senior Application Architect
> DATAKINETICS | Data Performance & Optimization
> Phone:+1.613.523.5500 x216
> Email: weinh...@dkl.com
> Visit us online at www.DKL.com

