The btequalimage() header comment says:
* Generic "equalimage" support function.
*
* B-Tree operator classes whose equality function could safely be replaced by
* datum_image_eq() in all cases can use this as their "equalimage" support
* function.
interval_ops, however, recognizes equal-but-distinguishable values:
create temp table t (c interval);
insert into t values ('1d'::interval), ('24h');
table t;
select distinct c from t;
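To make "equal-but-distinguishable" concrete, here is a minimal C sketch. It is not the PostgreSQL source: SketchInterval, sketch_interval_eq() and sketch_image_eq() are hypothetical names. It assumes only what the examples in this mail show, namely that comparison treats one day as 24 hours and one month as 30 days. The int64 arithmetic is a simplification; real code would need wider arithmetic to avoid overflow on extreme values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed conversion factors used only for comparison purposes. */
#define SKETCH_USECS_PER_DAY  INT64_C(86400000000)
#define SKETCH_DAYS_PER_MONTH 30

/* Hypothetical stand-in for an interval value: three independent fields. */
typedef struct SketchInterval
{
    int64_t time_usec;  /* sub-day part, in microseconds */
    int32_t day;        /* whole days */
    int32_t month;      /* whole months */
} SketchInterval;

/* Collapse the three fields into one comparable span of microseconds. */
static int64_t
sketch_interval_span(const SketchInterval *iv)
{
    assert(iv != NULL);
    int64_t days = (int64_t) iv->month * SKETCH_DAYS_PER_MONTH + iv->day;

    return days * SKETCH_USECS_PER_DAY + iv->time_usec;
}

/* Operator-class style equality: compares the collapsed spans. */
static bool
sketch_interval_eq(const SketchInterval *a, const SketchInterval *b)
{
    return sketch_interval_span(a) == sketch_interval_span(b);
}

/* Image-style equality: every stored field must match exactly. */
static bool
sketch_image_eq(const SketchInterval *a, const SketchInterval *b)
{
    return a->time_usec == b->time_usec &&
           a->day == b->day &&
           a->month == b->month;
}
```

With one value built as {0, 1, 0} ('1d') and another as {24 hours in microseconds, 0, 0} ('24h'), sketch_interval_eq() reports them equal while sketch_image_eq() does not. An "equalimage" support function promises that this disagreement never happens, which is why it cannot hold for interval_ops.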
The CREATE INDEX in the following test:
begin;
create table t (c interval);
insert into t select x from generate_series(1,500), (values ('1 year 1 month'::interval), ('1 year 30 days')) t(x);
select distinct c from t;
create index ti on t (c);
rollback;
fails with:
2498151 2023-10-10 05:06:46.177 GMT DEBUG: building index "ti" on table "t" serially
2498151 2023-10-10 05:06:46.178 GMT DEBUG: index "ti" can safely use deduplication
TRAP: failed Assert("!itup_key->allequalimage || keepnatts == _bt_keep_natts_fast(rel, lastleft, firstright)"), File: "nbtutils.c", Line: 2443, PID: 2498151
I've also caught btree posting lists where one TID refers to a '1d' heap
tuple, while another TID refers to a '24h' heap tuple. amcheck complains.
Index-only scans can return the '1d' bits where the actual tuple had the '24h'
bits. Are there other consequences to highlight in the release notes?

The back-branch patch is larger, to fix things without initdb. Hence, I'm
attaching patches for HEAD and for v16 (trivial to merge back from there).

I glanced at the other opfamilies permitting deduplication, and they look okay:
[local] test=*# select amproc, amproclefttype = amprocrighttype as l_eq_r,
array_agg(array[opfname, amproclefttype::regtype::text]) from pg_amproc join
pg_opfamily f on amprocfamily = f.oid where amprocnum = 4 and opfmethod = 403
group by 1,2;
─[ RECORD 1 ]────────────────────────────────────────
amproc │ btequalimage
l_eq_r │ t
array_agg │ {{bit_ops,bit},{bool_ops,boolean},{bytea_ops,bytea},{char_ops,"\"char\""},{datetime_ops,date},{datetime_ops,"timestamp without time zone"},{datetime_ops,"timestamp with time zone"},{network_ops,inet},{integer_ops,smallint},{integer_ops,integer},{integer_ops,bigint},{interval_ops,interval},{macaddr_ops,macaddr},{oid_ops,oid},{oidvector_ops,oidvector},{time_ops,"time without time zone"},{timetz_ops,"time with time zone"},{varbit_ops,"bit varying"},{text_pattern_ops,text},{bpchar_pattern_ops,character},{money_ops,money},{tid_ops,tid},{uuid_ops,uuid},{pg_lsn_ops,pg_lsn},{macaddr8_ops,macaddr8},{enum_ops,anyenum},{xid8_ops,xid8}}
─[ RECORD 2 ]────────────────────────────────────────
amproc │ btvarstrequalimage
l_eq_r │ t
array_agg │ {{bpchar_ops,character},{text_ops,text},{text_ops,name}}
Thanks,
nm
Author: Noah Misch <[email protected]>
Commit: Noah Misch <[email protected]>
Dissociate btequalimage() from interval_ops, ending its deduplication.
Under interval_ops, some equal values are distinguishable. One such
pair is '24:00:00' and '1 day'. With that being so, btequalimage()
breaches the documented contract for the "equalimage" btree support
function. This can cause incorrect results from index-only scans.
Users should REINDEX any indexes having interval-type columns and for
which "pg_amcheck --heapallindexed" reports an error. This fix makes
interval_ops simply omit the support function, like numeric_ops does.
Back-patch to v13, where btequalimage() first appeared. In back
branches, for the benefit of old catalog content, btequalimage() code
will return false for type "interval". Going forward, back-branch
initdb will include the catalog change.
Reviewed by FIXME.
Discussion: https://postgr.es/m/FIXME
diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat
index 5b95012..4c70da4 100644
--- a/src/include/catalog/pg_amproc.dat
+++ b/src/include/catalog/pg_amproc.dat
@@ -172,8 +172,6 @@
{ amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
amprocrighttype => 'interval', amprocnum => '3',
amproc => 'in_range(interval,interval,interval,bool,bool)' },
-{ amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
- amprocrighttype => 'interval', amprocnum => '4', amproc => 'btequalimage' },
{ amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
amprocrighttype => 'macaddr', amprocnum => '1', amproc => 'macaddr_cmp' },
{ amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index a1bdf2c..7a6f36a 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -2208,6 +2208,7 @@ ORDER BY 1, 2, 3;
| array_ops | array_ops | anyarray
| float_ops | float4_ops | real
| float_ops | float8_ops | double precision
+ | interval_ops | interval_ops | interval
| jsonb_ops | jsonb_ops | jsonb
| multirange_ops | multirange_ops | anymultirange
| numeric_ops | numeric_ops | numeric
@@ -2216,7 +2217,7 @@ ORDER BY 1, 2, 3;
| record_ops | record_ops | record
| tsquery_ops | tsquery_ops | tsquery
| tsvector_ops | tsvector_ops | tsvector
-(15 rows)
+(16 rows)
-- **************** pg_index ****************
-- Look for illegal values in pg_index fields.
Author: Noah Misch <[email protected]>
Commit: Noah Misch <[email protected]>
Dissociate btequalimage() from interval_ops, ending its deduplication.
Under interval_ops, some equal values are distinguishable. One such
pair is '24:00:00' and '1 day'. With that being so, btequalimage()
breaches the documented contract for the "equalimage" btree support
function. This can cause incorrect results from index-only scans.
Users should REINDEX any indexes having interval-type columns and for
which "pg_amcheck --heapallindexed" reports an error. This fix makes
interval_ops simply omit the support function, like numeric_ops does.
Back-patch to v13, where btequalimage() first appeared. In back
branches, for the benefit of old catalog content, btequalimage() code
will return false for type "interval". Going forward, back-branch
initdb will include the catalog change.
Reviewed by FIXME.
Discussion: https://postgr.es/m/FIXME
diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index 9f06ee7..251dd23 100644
--- a/src/backend/utils/adt/datum.c
+++ b/src/backend/utils/adt/datum.c
@@ -43,6 +43,7 @@
#include "postgres.h"
#include "access/detoast.h"
+#include "catalog/pg_type_d.h"
#include "common/hashfn.h"
#include "fmgr.h"
#include "utils/builtins.h"
@@ -385,20 +386,17 @@ datum_image_hash(Datum value, bool typByVal, int typLen)
* datum_image_eq() in all cases can use this as their "equalimage" support
* function.
*
- * Currently, we unconditionally assume that any B-Tree operator class that
- * registers btequalimage as its support function 4 must be able to safely use
- * optimizations like deduplication (i.e. we return true unconditionally). If
- * it ever proved necessary to rescind support for an operator class, we could
- * do that in a targeted fashion by doing something with the opcintype
- * argument.
+ * Earlier minor releases erroneously associated this function with
+ * interval_ops. Detect that case to rescind deduplication support, without
+ * requiring initdb.
*-------------------------------------------------------------------------
*/
Datum
btequalimage(PG_FUNCTION_ARGS)
{
- /* Oid opcintype = PG_GETARG_OID(0); */
+ Oid opcintype = PG_GETARG_OID(0);
- PG_RETURN_BOOL(true);
+ PG_RETURN_BOOL(opcintype != INTERVALOID);
}
/*-------------------------------------------------------------------------
diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat
index 5b95012..4c70da4 100644
--- a/src/include/catalog/pg_amproc.dat
+++ b/src/include/catalog/pg_amproc.dat
@@ -172,8 +172,6 @@
{ amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
amprocrighttype => 'interval', amprocnum => '3',
amproc => 'in_range(interval,interval,interval,bool,bool)' },
-{ amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
- amprocrighttype => 'interval', amprocnum => '4', amproc => 'btequalimage' },
{ amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
amprocrighttype => 'macaddr', amprocnum => '1', amproc => 'macaddr_cmp' },
{ amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index a1bdf2c..7a6f36a 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -2208,6 +2208,7 @@ ORDER BY 1, 2, 3;
| array_ops | array_ops | anyarray
| float_ops | float4_ops | real
| float_ops | float8_ops | double precision
+ | interval_ops | interval_ops | interval
| jsonb_ops | jsonb_ops | jsonb
| multirange_ops | multirange_ops | anymultirange
| numeric_ops | numeric_ops | numeric
@@ -2216,7 +2217,7 @@ ORDER BY 1, 2, 3;
| record_ops | record_ops | record
| tsquery_ops | tsquery_ops | tsquery
| tsvector_ops | tsvector_ops | tsvector
-(15 rows)
+(16 rows)
-- **************** pg_index ****************
-- Look for illegal values in pg_index fields.