On Sat, May 13, 2017 at 12:52 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> Can we think of defining separate portable hash functions which can be
> used for the purpose of hash partitioning?

I think that would be a good idea. I think it shouldn't even be that
hard. By data type:

- Integers. We'd need to make sure that we get the same results for
the same value on big-endian and little-endian hardware, and that
performance is good on both systems. That seems doable.

- Floats. There may be different representations in use on different
hardware, which could be a problem. Tom didn't answer my question
about whether any even-vaguely-modern hardware is still using non-IEEE
floats, which I suspect means that the answer is "no". If every bit
of hardware we are likely to find uses basically the same
representation of the same float value, then this shouldn't be hard.
(Also, even if this turns out to be hard for floats, using a float as
a partitioning key would be a surprising choice because the default
output representation isn't even unambiguous; you need
extra_float_digits for that.)

- Strings. There's basically only one representation for a string.
If we assume that the hash code only needs to be portable across
hardware and not across encodings, a position for which I already
argued upthread, then I think this should be manageable.

- Everything Else. Basically, everything else is just a composite of
that stuff, I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
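
To make the per-type points above a bit more concrete, here is a rough
sketch of what endian-independent hashing could look like. Everything
in it is an assumption for illustration: the function names
(portable_hash_int32, portable_hash_float8, portable_hash_text) are
hypothetical, and FNV-1a is used only as a simple stand-in, not as a
proposal and not as what PostgreSQL's hash_any() does. The float case
additionally assumes IEEE 754 doubles that share a byte order with
integers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * FNV-1a over raw bytes.  Stand-in hash for illustration only; this is
 * not PostgreSQL's hash_any().  It consumes one byte at a time, so the
 * result depends only on the byte sequence, not on host endianness.
 */
static uint32_t
fnv1a(const unsigned char *buf, size_t len)
{
    uint32_t h = 2166136261u;       /* FNV-1a 32-bit offset basis */
    size_t   i;

    for (i = 0; i < len; i++)
    {
        h ^= buf[i];
        h *= 16777619u;             /* FNV-1a 32-bit prime */
    }
    return h;
}

/*
 * Hypothetical: hash an int32 through a fixed little-endian byte image.
 * Shifts operate on the value, not on memory, so big-endian and
 * little-endian hosts build the same four bytes.
 */
static uint32_t
portable_hash_int32(int32_t value)
{
    uint32_t      v = (uint32_t) value;
    unsigned char buf[4];
    int           i;

    for (i = 0; i < 4; i++)
        buf[i] = (unsigned char) ((v >> (8 * i)) & 0xFF);
    return fnv1a(buf, sizeof buf);
}

/*
 * Hypothetical: hash a double.  Assumes IEEE 754 binary64 and that
 * doubles and integers use the same byte order on the host.  NaN
 * payloads are not normalized in this sketch.
 */
static uint32_t
portable_hash_float8(double value)
{
    uint64_t      bits;
    unsigned char buf[8];
    int           i;

    if (value == 0.0)
        value = 0.0;                /* fold -0.0 into +0.0: equal values, equal hash */
    memcpy(&bits, &value, sizeof bits);
    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char) ((bits >> (8 * i)) & 0xFF);
    return fnv1a(buf, sizeof buf);
}

/*
 * Hypothetical: hash a string's bytes as-is.  Portable across hardware,
 * but deliberately not across encodings.
 */
static uint32_t
portable_hash_text(const char *s, size_t len)
{
    return fnv1a((const unsigned char *) s, len);
}
```

The integer version pays for a few shifts on every call; whether that
is acceptable on both kinds of hardware is exactly the performance
question raised in the list above, and this sketch does not answer it.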