> You are aware of course that you can't use any hashing function on its own to
> detect duplicates? - the best you can do is detect *probable* duplicates,
Actually, if you choose the right hash function you can detect duplicates. If you create a UDF based on SHA256, the result would be unique for all practical purposes: the digest has 2^256 possible values, and there is no known collision of a SHA256 hash (https://en.wikipedia.org/wiki/Hash_function_security_summary).

Sean
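
To illustrate the idea, here is a minimal sketch in Python rather than as a database UDF. The helper names `sha256_hex` and `find_duplicates` are made up for this example, and it assumes the values being compared are strings that can be encoded as UTF-8. It groups values by their SHA256 digest and reports the positions that share a digest.

```python
import hashlib
from collections import defaultdict


def sha256_hex(value: str) -> str:
    """Return the SHA256 digest of a string as 64 hex characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def find_duplicates(values):
    """Group positions by digest and return the groups with more than one entry.

    Hypothetical helper: equal digests are treated as equal values, which
    relies on SHA256 having no known collisions.
    """
    groups = defaultdict(list)
    for index, value in enumerate(values):
        groups[sha256_hex(value)].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


rows = ["alpha", "beta", "alpha", "gamma", "beta"]
print(find_duplicates(rows))  # [[0, 2], [1, 4]]
```

If even the theoretical chance of a collision matters, the rows in each reported group can still be compared directly before being treated as duplicates.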