On 10/13/23 11:21, Dean Rasheed wrote:
> On Thu, 12 Oct 2023 at 23:43, Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
>>
>> Ashutosh Bapat reported me off-list a possible issue in how BRIN
>> minmax-multi calculate distance for infinite timestamp/date values.
>>
>> The current code does this:
>>
>>     if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2))
>>         PG_RETURN_FLOAT8(0);
>>
> 
> Yes indeed, that looks wrong. I noticed the same thing while reviewing
> the infinite interval patch.
> 
>> so means infinite values are "very close" to any other value, and thus
>> likely to be merged into a summary range. That's exactly the opposite of
>> what we want to do, possibly resulting in inefficient indexes.
>>
> 
> Is this only inefficient? Or can it also lead to wrong query results?
> 

I don't think it can produce incorrect results. It only affects which
values we "merge" into an interval when building the summaries.

>> Attached is a patch fixing this
>>
> 
> I wonder if it's actually necessary to give infinity any special
> handling at all for dates and timestamps. For those types, "infinity"
> is actually just INT_MIN/MAX, which compares correctly with any finite
> value, and will be much larger/smaller than any common value, so it
> seems like it isn't necessary to give "infinite" values any special
> treatment. That would be consistent with date_cmp() and
> timestamp_cmp().
> 

Right, but ....

> Something else that looks wrong about that BRIN code is that the
> integer subtraction might lead to overflow -- it is subtracting two
> integer values, then casting the result to float8. It should cast each
> input before subtracting, more like brin_minmax_multi_distance_int8().
> 

... it also needs to fix this, otherwise it overflows. Consider

     delta = dt2 - dt1;

and assume dt1 is INT64_MIN, or that dt2 is INT64_MAX.

> IOW, I think brin_minmax_multi_distance_date/timestamp could be made
> basically identical to brin_minmax_multi_distance_int4/8.
> 

Right. Attached is a patch doing it this way.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/brin/brin_minmax_multi.c b/src/backend/access/brin/brin_minmax_multi.c
index f8b2a3f9bc6..3d5e6d0c8f6 100644
--- a/src/backend/access/brin/brin_minmax_multi.c
+++ b/src/backend/access/brin/brin_minmax_multi.c
@@ -2084,10 +2084,13 @@ brin_minmax_multi_distance_date(PG_FUNCTION_ARGS)
 	DateADT		dateVal1 = PG_GETARG_DATEADT(0);
 	DateADT		dateVal2 = PG_GETARG_DATEADT(1);
 
-	if (DATE_NOT_FINITE(dateVal1) || DATE_NOT_FINITE(dateVal2))
-		PG_RETURN_FLOAT8(0);
+	/*
+	 * We know the values are range boundaries, but the range may be collapsed
+	 * (i.e. single points), with equal values.
+	 */
+	Assert(dateVal1 <= dateVal2);
 
-	PG_RETURN_FLOAT8(dateVal1 - dateVal2);
+	PG_RETURN_FLOAT8((double) dateVal2 - (double) dateVal1);
 }
 
 /*
@@ -2136,19 +2139,16 @@ brin_minmax_multi_distance_timetz(PG_FUNCTION_ARGS)
 Datum
 brin_minmax_multi_distance_timestamp(PG_FUNCTION_ARGS)
 {
-	float8		delta = 0;
-
 	Timestamp	dt1 = PG_GETARG_TIMESTAMP(0);
 	Timestamp	dt2 = PG_GETARG_TIMESTAMP(1);
 
-	if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2))
-		PG_RETURN_FLOAT8(0);
-
-	delta = dt2 - dt1;
-
-	Assert(delta >= 0);
+	/*
+	 * We know the values are range boundaries, but the range may be collapsed
+	 * (i.e. single points), with equal values.
+	 */
+	Assert(dt1 <= dt2);
 
-	PG_RETURN_FLOAT8(delta);
+	PG_RETURN_FLOAT8((double) dt2 - (double) dt1);
 }
 
 /*

Reply via email to