Re: [HACKERS] Hash Functions

Robert Haas Thu, 03 Aug 2017 14:51:15 -0700

On Thu, Aug 3, 2017 at 5:32 PM, Andres Freund <[email protected]> wrote:
>> Do you have any feeling for which of those endianness-independent hash
>> functions might be a reasonable choice for us?
>
> Not a strong / very informed one, TBH.
>
> I'm not convinced it's worth trying to achieve this in the first place,
> now that we "nearly" have the insert-via-parent feature. Isn't this a
> lot of work, for very little practical gain? Having to select that when
> switching architectures still seems unproblematic. People will just
> about never switch endianess, so even a tiny performance & effort
> overhead doesn't seem worth it to me.


I kind of agree with you.  There are some advantages of being
endian-independent, like maybe your hash partitioning is really across
multiple shards, and not all the shards are the same machine
architecture, but it's not going to come up for most people.

For me, the basic point here is that we need a set of hash functions
for hash partitioning that are different than what we use for hash
indexes and hash joins -- otherwise when we hash partition a table and
create hash indexes on each partition, those indexes will have nasty
clustering.  Partitionwise hash joins will have similar problems.  So,
a new set of hash functions specifically for hash partitioning is
quite desirable.

Given that, if it's not a big problem to pick ones that have the
portability properties at least some people want, I'd be inclined to
do it.  If it results in a noticeable slowdown on macrobenchmarks,
then not so much, but otherwise, I'd rather do what people are asking
for than spend time arguing about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hash Functions

Reply via email to