David Rowley <david.row...@2ndquadrant.com> writes: > On Sun, 3 Mar 2019 at 05:25, Tom Lane <t...@sss.pgh.pa.us> wrote: >> My initial thought about plugging that admittedly-academic point is >> to insist that the join operator be both strict and a member of a >> btree opclass (hash might be OK too; less sure about other index types).
> Why strict? If both inputs are non-NULL, then what additional > guarantees does strict give us? Yeah, if we're verifying the inputs are non-null, I think that probably doesn't matter. > I implemented a btree opfamily check in my version of the patch. Not > so sure about hash, can you point me in the direction of a mention of > how this is guarantees for btree? https://www.postgresql.org/docs/devel/btree-support-funcs.html quoth The comparison function must take two non-null values A and B and return an int32 value that is < 0, 0, or > 0 when A < B, A = B, or A > B, respectively. A null result is disallowed: all values of the data type must be comparable. (At the code level, this is implicit in the fact that the comparison function will be called via FunctionCall2Coll or a sibling, and those all throw an error if the called function returns NULL.) Now, it doesn't say in so many words that the comparison operators have to yield results consistent with the comparison support function, but I think that's pretty obvious ... For hash, the equivalent constraint is that the hash function has to work for every possible input value. I suppose it's possible that the associated equality operator would sometimes return null, but it's hard to think of a practical reason for doing so. I've not dug in the code, but I wouldn't be too surprised if nodeMergejoin.c or nodeHashjoin.c, or the stuff related to hash grouping or hash aggregation, also contain assumptions about the equality operators not returning null. > The list of builtin types that have a hash opfamily but no btree > opfamily that support NOT IN are not very exciting, so doing the same > for hash might not be worth the extra code. Agreed for builtin types, but there might be some extensions out there where this doesn't hold. It's not terribly hard to imagine a data type that hasn't got a linear sort order but is amenable to hashing. (The in-core xid type is an example, actually.) regards, tom lane