Hi,

While looking at the hashed ScalarArrayOpExpr path, I noticed that the
linear
and hashed evaluation paths can give different answers if the comparison
function returns NULL for non-NULL inputs.

The part I am trying to understand is what assumption the hashed path is
allowed to make about equality operators used for hashing.  Does declaring
an
equality operator as HASHES / putting it in a hash opclass imply that, for
non-NULL inputs, the equality result is a definite true or false value?

Reproduced on master.

Setup:

CREATE FUNCTION weird_strict_int_eq(int, int)
RETURNS bool
LANGUAGE sql IMMUTABLE STRICT
AS $$
  SELECT CASE
           WHEN $1 = 42 AND $2 = 42 THEN NULL
           ELSE $1 = $2
         END
$$;

CREATE FUNCTION weird_strict_int_ne(int, int)
RETURNS bool
LANGUAGE sql IMMUTABLE STRICT
AS $$
  SELECT NOT weird_strict_int_eq($1, $2)
$$;

CREATE OPERATOR === (
  LEFTARG = int,
  RIGHTARG = int,
  PROCEDURE = weird_strict_int_eq,
  COMMUTATOR = ===,
  NEGATOR = !==,
  HASHES
);

CREATE OPERATOR !== (
  LEFTARG = int,
  RIGHTARG = int,
  PROCEDURE = weird_strict_int_ne,
  COMMUTATOR = !==,
  NEGATOR = ===
);

CREATE OPERATOR CLASS weird_strict_int_hash_ops
FOR TYPE int USING hash AS
  OPERATOR 1 ===,
  FUNCTION 1 hashint4(int);

The hashed ScalarArrayOpExpr path activates once the constant array has at
least MIN_ARRAY_SIZE_FOR_HASHED_SAOP (currently 9) elements, so the
8-element
variant uses the linear evaluation and the 9-element variant uses the hash
table. I wrap the left-hand side in a non-immutable function call so the
expression is not constant-folded at plan time.

CREATE FUNCTION return_int_input(int) RETURNS int
LANGUAGE sql STABLE AS $$ SELECT $1 $$;

SELECT
  return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,42])   AS linear_any,
  return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_any,
  return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,42])   AS linear_not_in,
  return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_not_in;

The linear path preserves the NULL (UNKNOWN) result from the comparison.
The
hashed path seems to treat a NULL from the comparison as a non-match,
producing
a different result.

I realize this is a strange equality operator, and that may be the whole
point.
For a hash table lookup, it seems reasonable to need a definite answer to
"does
this stored key match the lookup key?".  On the other hand,
ScalarArrayOpExpr
itself has SQL three-valued semantics, and the linear path does preserve the
comparison's NULL result.

So my question is: is the hashed path allowed to assume that hashable
equality
operators never return NULL for non-NULL inputs, or should it preserve the
same
UNKNOWN result that the linear ScalarArrayOpExpr evaluation would produce?

Regards,
Ayush

Reply via email to