On Wed, Oct 11, 2023 at 01:00:44PM -0700, Peter Geoghegan wrote:
> On Wed, Oct 11, 2023 at 11:38 AM Noah Misch <n...@leadboat.com> wrote:
> > Interesting.  So, >99% of interval-type indexes, even ones WITH
> > (deduplicate_items=off), will get amcheck failures.  The <1% of exceptions
> > might include indexes having allequalimage=off due to an additional column,
> > e.g. a two-column (interval, numeric) index.  If interval indexes are common
> > enough and "pg_amcheck --heapallindexed" failures from $SUBJECT are relatively
> > rare, that could argue for giving amcheck a special case.  Specifically,
> > downgrade its "metapage incorrectly indicates that deduplication is safe" from
> > ERROR to WARNING for interval_ops only.
> 
> I am not aware of any user actually running "deduplicate_items = off"
> in production, for any index. It was added purely as a defensive thing
> -- not because I anticipated any real need to disable deduplication.
> Deduplication was optimized for being enabled by default.

Sure.  Low-importance background information: deduplicate_items=off got on my
radar while I was wondering if ALTER INDEX ... SET (deduplicate_items=off)
would clear allequalimage.  If it had, we could have advised people to use
ALTER INDEX, then rebuild only those indexes still failing "pg_amcheck
--heapallindexed".  ALTER INDEX doesn't do that, ruling out that idea.

> > Without that special case (i.e. with
> > the v1 patch), the release notes should probably resemble, "After updating,
> > run REINDEX on all indexes having an interval-type column."
> 
> +1
> 
> > There's little
> > point in recommending pg_amcheck if >99% will fail.  I'm inclined to bet that
> > interval-type indexes are rare, so I lean against adding the amcheck special
> > case.  It's not a strong preference.  Other opinions?

> exactly one case like that post-fix (interval_ops is at least the only
> affected core code opfamily), so why not point that out directly with
> a HINT? A HINT could go a long way towards putting the problem in
> context, without really adding a special case, and without any real
> question of users being misled.

Works for me.  Added.
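
For anyone reading the thread without the background, here is a minimal
standalone sketch of why interval_ops can't promise "equal implies identical
image", as the commit message in the patch describes.  It is not the backend
code.  The struct and function names are made up for illustration, and I'm
assuming the usual comparison rule of 30 days per month and 24 hours per day.
The real comparison uses 128-bit arithmetic; plain int64_t here is fine only
for small values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for an interval value; names are hypothetical. */
typedef struct
{
    int64_t time_usec;   /* sub-day part, in microseconds */
    int32_t day;         /* whole days, stored separately */
    int32_t month;       /* whole months, stored separately */
} SketchInterval;

#define SKETCH_USECS_PER_DAY  INT64_C(86400000000)
#define SKETCH_DAYS_PER_MONTH 30

/* Collapse all three fields into one microsecond count for comparison. */
int64_t
sketch_interval_span(SketchInterval iv)
{
    int64_t days = (int64_t) iv.month * SKETCH_DAYS_PER_MONTH + iv.day;

    return days * SKETCH_USECS_PER_DAY + iv.time_usec;
}

/* Equality as a btree comparison would see it: compare collapsed spans. */
bool
sketch_interval_equal(SketchInterval a, SketchInterval b)
{
    return sketch_interval_span(a) == sketch_interval_span(b);
}

/* "Image" equality: every stored field must match exactly. */
bool
sketch_interval_image_equal(SketchInterval a, SketchInterval b)
{
    return a.time_usec == b.time_usec && a.day == b.day && a.month == b.month;
}

/* '1 day' and '24:00:00' compare equal yet are stored differently. */
void
sketch_interval_demo(void)
{
    SketchInterval one_day = {0, 1, 0};
    SketchInterval day_of_hours = {SKETCH_USECS_PER_DAY, 0, 0};

    assert(sketch_interval_equal(one_day, day_of_hours));
    assert(!sketch_interval_image_equal(one_day, day_of_hours));
}
```

Deduplication keeps a single copy of the key for a group of equal tuples, so
an index-only scan could hand back '1 day' for a row that stores '24:00:00'.
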
Author:     Noah Misch <n...@leadboat.com>
Commit:     Noah Misch <n...@leadboat.com>

    Dissociate btequalimage() from interval_ops, ending its deduplication.
    
    Under interval_ops, some equal values are distinguishable.  One such
    pair is '24:00:00' and '1 day'.  With that being so, btequalimage()
    breaches the documented contract for the "equalimage" btree support
    function.  This can cause incorrect results from index-only scans.
    Users should REINDEX any btree indexes having interval-type columns.
    After updating, pg_amcheck will report an error for almost all such
    indexes.  This fix makes interval_ops simply omit the support function,
    like numeric_ops does.  Back-patch to v13, where btequalimage() first
    appeared.  In back branches, for the benefit of old catalog content,
    btequalimage() code will return false for type "interval".  Going
    forward, back-branch initdb will include the catalog change.
    
    Reviewed by Peter Geoghegan.
    
    Discussion: https://postgr.es/m/20231011013317.22.nmi...@google.com

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index dbb83d8..3e07a3e 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -31,6 +31,7 @@
 #include "access/xact.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "catalog/pg_opfamily_d.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
@@ -338,10 +339,20 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
                                         errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
                                                        RelationGetRelationName(indrel))));
                if (allequalimage && !_bt_allequalimage(indrel, false))
+               {
+                       bool            has_interval_ops = false;
+
+                       for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+                               if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+                                       has_interval_ops = true;
                        ereport(ERROR,
                                        (errcode(ERRCODE_INDEX_CORRUPTED),
                                         errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-                                                       RelationGetRelationName(indrel))));
+                                                       RelationGetRelationName(indrel)),
+                                        has_interval_ops
+                                        ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+                                        : 0));
+               }
 
                /* Check index, possibly against table it is an index on */
                bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat
index 5b95012..4c70da4 100644
--- a/src/include/catalog/pg_amproc.dat
+++ b/src/include/catalog/pg_amproc.dat
@@ -172,8 +172,6 @@
 { amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
   amprocrighttype => 'interval', amprocnum => '3',
   amproc => 'in_range(interval,interval,interval,bool,bool)' },
-{ amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
-  amprocrighttype => 'interval', amprocnum => '4', amproc => 'btequalimage' },
 { amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
   amprocrighttype => 'macaddr', amprocnum => '1', amproc => 'macaddr_cmp' },
 { amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
diff --git a/src/include/catalog/pg_opfamily.dat b/src/include/catalog/pg_opfamily.dat
index 91587b9..81a8525 100644
--- a/src/include/catalog/pg_opfamily.dat
+++ b/src/include/catalog/pg_opfamily.dat
@@ -50,7 +50,7 @@
   opfmethod => 'btree', opfname => 'integer_ops' },
 { oid => '1977',
   opfmethod => 'hash', opfname => 'integer_ops' },
-{ oid => '1982',
+{ oid => '1982', oid_symbol => 'INTERVAL_BTREE_FAM_OID',
   opfmethod => 'btree', opfname => 'interval_ops' },
 { oid => '1983',
   opfmethod => 'hash', opfname => 'interval_ops' },
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index a1bdf2c..7a6f36a 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -2208,6 +2208,7 @@ ORDER BY 1, 2, 3;
                     | array_ops        | array_ops        | anyarray
                     | float_ops        | float4_ops       | real
                     | float_ops        | float8_ops       | double precision
+                    | interval_ops     | interval_ops     | interval
                     | jsonb_ops        | jsonb_ops        | jsonb
                     | multirange_ops   | multirange_ops   | anymultirange
                     | numeric_ops      | numeric_ops      | numeric
@@ -2216,7 +2217,7 @@ ORDER BY 1, 2, 3;
                     | record_ops       | record_ops       | record
                     | tsquery_ops      | tsquery_ops      | tsquery
                     | tsvector_ops     | tsvector_ops     | tsvector
-(15 rows)
+(16 rows)
 
 -- **************** pg_index ****************
 -- Look for illegal values in pg_index fields.
Author:     Noah Misch <n...@leadboat.com>
Commit:     Noah Misch <n...@leadboat.com>

    Dissociate btequalimage() from interval_ops, ending its deduplication.
    
    Under interval_ops, some equal values are distinguishable.  One such
    pair is '24:00:00' and '1 day'.  With that being so, btequalimage()
    breaches the documented contract for the "equalimage" btree support
    function.  This can cause incorrect results from index-only scans.
    Users should REINDEX any btree indexes having interval-type columns.
    After updating, pg_amcheck will report an error for almost all such
    indexes.  This fix makes interval_ops simply omit the support function,
    like numeric_ops does.  Back-patch to v13, where btequalimage() first
    appeared.  In back branches, for the benefit of old catalog content,
    btequalimage() code will return false for type "interval".  Going
    forward, back-branch initdb will include the catalog change.
    
    Reviewed by Peter Geoghegan.
    
    Discussion: https://postgr.es/m/20231011013317.22.nmi...@google.com

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index dbb83d8..3e07a3e 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -31,6 +31,7 @@
 #include "access/xact.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
+#include "catalog/pg_opfamily_d.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
@@ -338,10 +339,20 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
                                         errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
                                                        RelationGetRelationName(indrel))));
                if (allequalimage && !_bt_allequalimage(indrel, false))
+               {
+                       bool            has_interval_ops = false;
+
+                       for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+                               if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+                                       has_interval_ops = true;
                        ereport(ERROR,
                                        (errcode(ERRCODE_INDEX_CORRUPTED),
                                         errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-                                                       RelationGetRelationName(indrel))));
+                                                       RelationGetRelationName(indrel)),
+                                        has_interval_ops
+                                        ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+                                        : 0));
+               }
 
                /* Check index, possibly against table it is an index on */
                bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index 9f06ee7..251dd23 100644
--- a/src/backend/utils/adt/datum.c
+++ b/src/backend/utils/adt/datum.c
@@ -43,6 +43,7 @@
 #include "postgres.h"
 
 #include "access/detoast.h"
+#include "catalog/pg_type_d.h"
 #include "common/hashfn.h"
 #include "fmgr.h"
 #include "utils/builtins.h"
@@ -385,20 +386,17 @@ datum_image_hash(Datum value, bool typByVal, int typLen)
  * datum_image_eq() in all cases can use this as their "equalimage" support
  * function.
  *
- * Currently, we unconditionally assume that any B-Tree operator class that
- * registers btequalimage as its support function 4 must be able to safely use
- * optimizations like deduplication (i.e. we return true unconditionally).  If
- * it ever proved necessary to rescind support for an operator class, we could
- * do that in a targeted fashion by doing something with the opcintype
- * argument.
+ * Earlier minor releases erroneously associated this function with
+ * interval_ops.  Detect that case to rescind deduplication support, without
+ * requiring initdb.
  *-------------------------------------------------------------------------
  */
 Datum
 btequalimage(PG_FUNCTION_ARGS)
 {
-       /* Oid          opcintype = PG_GETARG_OID(0); */
+       Oid                     opcintype = PG_GETARG_OID(0);
 
-       PG_RETURN_BOOL(true);
+       PG_RETURN_BOOL(opcintype != INTERVALOID);
 }
 
 /*-------------------------------------------------------------------------
diff --git a/src/include/catalog/pg_amproc.dat b/src/include/catalog/pg_amproc.dat
index 5b95012..4c70da4 100644
--- a/src/include/catalog/pg_amproc.dat
+++ b/src/include/catalog/pg_amproc.dat
@@ -172,8 +172,6 @@
 { amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
   amprocrighttype => 'interval', amprocnum => '3',
   amproc => 'in_range(interval,interval,interval,bool,bool)' },
-{ amprocfamily => 'btree/interval_ops', amproclefttype => 'interval',
-  amprocrighttype => 'interval', amprocnum => '4', amproc => 'btequalimage' },
 { amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
   amprocrighttype => 'macaddr', amprocnum => '1', amproc => 'macaddr_cmp' },
 { amprocfamily => 'btree/macaddr_ops', amproclefttype => 'macaddr',
diff --git a/src/include/catalog/pg_opfamily.dat b/src/include/catalog/pg_opfamily.dat
index 91587b9..81a8525 100644
--- a/src/include/catalog/pg_opfamily.dat
+++ b/src/include/catalog/pg_opfamily.dat
@@ -50,7 +50,7 @@
   opfmethod => 'btree', opfname => 'integer_ops' },
 { oid => '1977',
   opfmethod => 'hash', opfname => 'integer_ops' },
-{ oid => '1982',
+{ oid => '1982', oid_symbol => 'INTERVAL_BTREE_FAM_OID',
   opfmethod => 'btree', opfname => 'interval_ops' },
 { oid => '1983',
   opfmethod => 'hash', opfname => 'interval_ops' },
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index a1bdf2c..7a6f36a 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -2208,6 +2208,7 @@ ORDER BY 1, 2, 3;
                     | array_ops        | array_ops        | anyarray
                     | float_ops        | float4_ops       | real
                     | float_ops        | float8_ops       | double precision
+                    | interval_ops     | interval_ops     | interval
                     | jsonb_ops        | jsonb_ops        | jsonb
                     | multirange_ops   | multirange_ops   | anymultirange
                     | numeric_ops      | numeric_ops      | numeric
@@ -2216,7 +2217,7 @@ ORDER BY 1, 2, 3;
                     | record_ops       | record_ops       | record
                     | tsquery_ops      | tsquery_ops      | tsquery
                     | tsvector_ops     | tsvector_ops     | tsvector
-(15 rows)
+(16 rows)
 
 -- **************** pg_index ****************
 -- Look for illegal values in pg_index fields.
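
To spell out how the back-branch datum.c change interacts with old catalog
content, here is a rough standalone model of the check.  It is only a sketch:
the struct, the "sketch_" names, and the has_equalimage_proc flag are invented
for illustration, and the loop is my paraphrase of the per-column test, not
the nbtree code itself.  The type OIDs are the built-in ones (int4 = 23,
numeric = 1700, interval = 1186).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INT4OID     23u
#define SKETCH_INTERVALOID 1186u
#define SKETCH_NUMERICOID  1700u

/* Models the patched btequalimage(): interval is the one rejected type. */
bool
sketch_btequalimage(uint32_t opcintype)
{
    return opcintype != SKETCH_INTERVALOID;
}

/* Hypothetical description of one index key column. */
typedef struct
{
    uint32_t opcintype;            /* input type of the column's opclass */
    bool     has_equalimage_proc;  /* is support function 4 registered? */
} SketchKeyColumn;

/*
 * Deduplication is treated as safe only when every key column registers
 * an equalimage support function and that function returns true.
 */
bool
sketch_allequalimage(const SketchKeyColumn *cols, size_t ncols)
{
    for (size_t i = 0; i < ncols; i++)
    {
        if (!cols[i].has_equalimage_proc)
            return false;
        if (!sketch_btequalimage(cols[i].opcintype))
            return false;
    }
    return true;
}

void
sketch_allequalimage_demo(void)
{
    /* Old catalog content: interval_ops still lists the support function. */
    SketchKeyColumn old_interval[] = {{SKETCH_INTERVALOID, true}};
    /* After a fresh initdb: the pg_amproc entry is simply absent. */
    SketchKeyColumn new_interval[] = {{SKETCH_INTERVALOID, false}};
    SketchKeyColumn int4_only[] = {{SKETCH_INT4OID, true}};

    assert(!sketch_allequalimage(old_interval, 1));
    assert(!sketch_allequalimage(new_interval, 1));
    assert(sketch_allequalimage(int4_only, 1));
}
```

Either way the answer for an interval column is now "not safe", so an index
whose metapage was written with allequalimage set trips the amcheck error
until it is rebuilt.  A two-column (interval, numeric) index never had the
flag set, since numeric_ops registers no such function.
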
