Re: check for null value before looking up the hash function

David G. Johnston Sat, 21 May 2022 09:05:23 -0700

On Sat, May 21, 2022 at 8:32 AM Ranier Vilela <[email protected]> wrote:


> Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
> [email protected]> escreveu:
>
>>
>>
>> On 5/21/22 15:06, Ranier Vilela wrote:
>> >>Zhihong Yu <zyu(at)yugabyte(dot)com> writes:
>> >>> I was looking at the code in hash_record()
>> >>> of src/backend/utils/adt/rowtypes.c
>> >>> It seems if nulls[i] is true, we don't need to look up the hash
>> function.
>> >
>> >>I don't think this is worth changing. It complicates the logic,
>> >>rendering it unlike quite a few other functions written in the same
>> >>style. In cases where the performance actually matters, the hash
>> >>function is cached across multiple calls anyway. You might save
>> >>something if you have many calls in a query and not one of them
>> >>receives a non-null input, but how likely is that?
>> >
>> > I disagree.
>> > I think that is worth changing. The fact of complicating the logic
>> > is irrelevant.
>>
>> That's a quite bold claim, and yet you haven't supported it by any
>> argument whatsoever. Trade-offs between complexity and efficiency are a
>> crucial development task, so complicating the logic clearly does matter.
>>
> What I meant is that complicating the logic in search of efficiency is
> worth it, and that's what everyone is looking for in this thread.
> Likewise, not complicating the logic, losing a little bit of efficiency,
> applied to all the code, leads to a big loss of efficiency.
> In other words, I never miss an opportunity to gain efficiency.
>
>
I disliked the fact that 80% of the patch was adding indentation.  Instead,
remove indentation for the normal flow case and move the loop short-circuit
to the top of the loop where it is traditionally found.

This seems like a win on both efficiency and complexity grounds.  Having
the "/* see hash_array() */" comment and logic twice is a downside but a
minor one that could be replaced with a function call if desired.

diff --git a/src/backend/utils/adt/rowtypes.c
b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..0bc28d1742 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -1838,6 +1838,13 @@ hash_record(PG_FUNCTION_ARGS)
        TypeCacheEntry *typentry;
        uint32      element_hash;

+       if (nulls[i])
+       {
+           /* see hash_array() */
+           result = (result << 5) - result + 0;
+           continue;
+       }
+
        att = TupleDescAttr(tupdesc, i);

        if (att->attisdropped)
@@ -1860,24 +1867,16 @@ hash_record(PG_FUNCTION_ARGS)
            my_extra->columns[i].typentry = typentry;
        }

-       /* Compute hash of element */
-       if (nulls[i])
-       {
-           element_hash = 0;
-       }
-       else
-       {
-           LOCAL_FCINFO(locfcinfo, 1);
+       LOCAL_FCINFO(locfcinfo, 1);

-           InitFunctionCallInfoData(*locfcinfo,
&typentry->hash_proc_finfo, 1,
-                                    att->attcollation, NULL, NULL);
-           locfcinfo->args[0].value = values[i];
-           locfcinfo->args[0].isnull = false;
-           element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));
+       InitFunctionCallInfoData(*locfcinfo, &typentry->hash_proc_finfo, 1,
+                                   att->attcollation, NULL, NULL);
+       locfcinfo->args[0].value = values[i];
+       locfcinfo->args[0].isnull = false;
+       element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));

-           /* We don't expect hash support functions to return null */
-           Assert(!locfcinfo->isnull);
-       }
+       /* We don't expect hash support functions to return null */
+       Assert(!locfcinfo->isnull);

        /* see hash_array() */
        result = (result << 5) - result + element_hash;

Re: check for null value before looking up the hash function

Reply via email to