As a variant on John's technique, I'll combine fields into a single text
block that I then run through a fast hashing algorithm that returns a
longint.

What good is that longint? It helps in two cases:

* If you're comparing two copies of the same record during an update/sync,
etc., then you can hash the new copy and see if the hash differs from the
stored original. If they match, you can figure there's been no update.
[Subject to availability, limitations apply. See 'hashing' for complete
terms and details.]

* When you don't know if the row is a duplicate or not, hash the incoming
data and see if it matches the hash of any existing row. If it does not
match, you've got a new row. If it does match one or more rows, you *might*
have a duplicate, but at least you only need to get a small number of
records to find out.
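
Here's a rough sketch of both cases, in Python rather than 4D. Everything named here is made up for illustration: record_hash, has_changed, find_duplicates, the separator character, and the in-memory rows/index dictionaries. zlib.crc32 is only a stand-in for whatever fast 32-bit hash you actually use.

```python
import zlib

# Assumption: this separator never occurs inside a field value. Without a
# separator, ("ab", "c") and ("a", "bc") would produce the same text block.
FIELD_SEP = "\x1f"


def record_hash(fields):
    """Join the field values into one text block and hash it to a 32-bit int."""
    block = FIELD_SEP.join(str(value) for value in fields)
    return zlib.crc32(block.encode("utf-8"))  # stand-in for a fast hash


def has_changed(stored_hash, new_fields):
    """Case 1: compare the hash of the new copy with the stored hash."""
    return record_hash(new_fields) != stored_hash


def find_duplicates(index, rows, new_fields):
    """Case 2: fetch only the rows sharing the hash, then compare them fully.

    index maps hash -> list of row ids; rows maps row id -> tuple of fields.
    """
    candidates = index.get(record_hash(new_fields), [])
    return [row_id for row_id in candidates if rows[row_id] == tuple(new_fields)]


# Hypothetical sample data standing in for a table and its stored hashes.
rows = {
    1: ("Ada", "Lovelace", "London"),
    2: ("Alan", "Turing", "Wilmslow"),
}
index = {}
for row_id, fields in rows.items():
    index.setdefault(record_hash(fields), []).append(row_id)
```

The full comparison inside find_duplicates is what covers the "might" in the second case: a matching hash only narrows the search, it doesn't prove the rows are the same.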

Regarding hashing, I don't use SHA1, MD5, etc., because I don't need them
and don't want the overhead. Instead I use some hashing functions from an
old (10+ years ago) tech note. They still work great and are a good match
for exactly this sort of application.
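
To give a feel for how little code a fast non-cryptographic hash needs, here is a sketch of 32-bit FNV-1a in Python. This is not the function from the tech note (the post doesn't say which algorithms it contains); FNV-1a is swapped in purely as a well-known example of the same kind of hash. The names fnv1a_32 and to_longint are hypothetical, and to_longint assumes the longint is a signed 32-bit integer.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Return the unsigned 32-bit FNV-1a hash of data."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte                          # mix in the next byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF   # multiply, keep the low 32 bits
    return h


def to_longint(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - 0x100000000 if value >= 0x80000000 else value


text_block = "Ada\x1fLovelace\x1fLondon"  # hypothetical combined fields
hash_as_longint = to_longint(fnv1a_32(text_block.encode("utf-8")))
```

A hash like this makes no attempt to resist deliberate collisions, which is fine when the goal is only to spot changed or candidate-duplicate rows cheaply.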
**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**********************************************************************
