Attached is a patch for a bit of infrastructure I believe to be
necessary for correct behavior of REFRESH MATERIALIZED VIEW
CONCURRENTLY as well as incremental maintenance of matviews.
The idea is that after RMVC or incremental maintenance, the matview
should not be visibly different that it would have been at creation
or on a non-concurrent REFRESH. The issue is easy to demonstrate
with citext, but anywhere that the = operator allows user-visible
differences between "equal" values it can be an issue.
test=# CREATE TABLE citext_table (
test(# id serial primary key,
test(# name citext
test(# );
CREATE TABLE
test=# INSERT INTO citext_table (name)
test-# VALUES ('one'), ('two'), ('three'), (NULL), (NULL);
INSERT 0 5
test=# CREATE MATERIALIZED VIEW citext_matview AS
test-# SELECT * FROM citext_table;
SELECT 5
test=# CREATE UNIQUE INDEX citext_matview_id
test-# ON citext_matview (id);
CREATE INDEX
test=# UPDATE citext_table SET name = 'Two' WHERE name = 'TWO';
UPDATE 1
At this point, the table and the matview have visibly different
values, yet without the patch the query used to find differences
for RMVC would be essentially like this (slightly simplified for
readability):
test=# SELECT *
FROM citext_matview m
FULL JOIN citext_table t ON (t.id = m.id AND t = m)
WHERE t IS NULL OR m IS NULL;
id | name | id | name
----+------+----+------
(0 rows)
No differences were found, so without this patch, the matview would
remain visibly different from the results generated by a run of its
defining query.
The patch adds an "identical" operator (===) for the record type:
test=# SELECT *
FROM citext_matview m
FULL JOIN citext_table t ON (t.id = m.id AND t === m)
WHERE t IS NULL OR m IS NULL;
id | name | id | name
----+------+----+------
| | 2 | Two
2 | two | |
(2 rows)
The difference is now found, so RMVC makes the appropriate change.
test=# REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview;
REFRESH MATERIALIZED VIEW
test=# SELECT * FROM citext_matview ORDER BY id;
id | name
----+-------
1 | one
2 | Two
3 | three
4 |
5 |
(5 rows)
The patch adds all of the functions, operators, and catalog
information to support merge joins using the "identical" operator.
The new operator is logically similar to IS NOT DISTINCT FROM for a
record, although its implementation is very different. For one
thing, it doesn't replace the operation with column level operators
in the parser. For another thing, it doesn't look up operators for
each type, so the "identical" operator does not need to be
implemented for each type to use it as shown above. It compares
values byte-for-byte, after detoasting. The test for identical
records can avoid the detoasting altogether for any values with
different lengths, and it stops when it finds the first column with
a difference.
I toyed with the idea of supporting hashing of records using this
operator, but could not see how that would be a performance win.
The identical (===) and not identical (!==) operator names were
chosen because of a vague similarity to the "exactly equals"
concepts in JavaScript and PHP, which use that name. The semantics
aren't quite the same, but it seemed close enough not to be too
surprising. The additional operator names seemed natural to me
based on the first two, but I'm not really that attached to these
names for the operators if someone has a better idea.
Since the comparison of record values is not documented (only
comparisons involving row value constructors), it doesn't seem like
we should document this special case. It is intended primarily for
support of matview refresh and maintenance, and it seems likely
that record comparison was not documented on the basis that it is
intended primarily for support of such things as indexing and merge
joins -- so leaving the new operators undocumented seems consistent
with existing policy. I'm open to arguments that the policy should
change.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
*** a/contrib/citext/expected/citext.out
--- b/contrib/citext/expected/citext.out
***************
*** 2276,2278 **** SELECT like_escape( name::text, ''::citext ) = like_escape( name::text, '' ) AS
--- 2276,2319 ----
t
(5 rows)
+ -- Ensure correct behavior for citext with materialized views.
+ CREATE TABLE citext_table (
+ id serial primary key,
+ name citext
+ );
+ INSERT INTO citext_table (name)
+ VALUES ('one'), ('two'), ('three'), (NULL), (NULL);
+ CREATE MATERIALIZED VIEW citext_matview AS
+ SELECT * FROM citext_table;
+ CREATE UNIQUE INDEX citext_matview_id
+ ON citext_matview (id);
+ SELECT *
+ FROM citext_matview m
+ FULL JOIN citext_table t ON (t.id = m.id AND t === m)
+ WHERE t.id IS NULL OR m.id IS NULL;
+ id | name | id | name
+ ----+------+----+------
+ (0 rows)
+
+ UPDATE citext_table SET name = 'Two' WHERE name = 'TWO';
+ SELECT *
+ FROM citext_matview m
+ FULL JOIN citext_table t ON (t.id = m.id AND t === m)
+ WHERE t.id IS NULL OR m.id IS NULL;
+ id | name | id | name
+ ----+------+----+------
+ | | 2 | Two
+ 2 | two | |
+ (2 rows)
+
+ REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview;
+ SELECT * FROM citext_matview ORDER BY id;
+ id | name
+ ----+-------
+ 1 | one
+ 2 | Two
+ 3 | three
+ 4 |
+ 5 |
+ (5 rows)
+
*** a/contrib/citext/expected/citext_1.out
--- b/contrib/citext/expected/citext_1.out
***************
*** 2276,2278 **** SELECT like_escape( name::text, ''::citext ) = like_escape( name::text, '' ) AS
--- 2276,2319 ----
t
(5 rows)
+ -- Ensure correct behavior for citext with materialized views.
+ CREATE TABLE citext_table (
+ id serial primary key,
+ name citext
+ );
+ INSERT INTO citext_table (name)
+ VALUES ('one'), ('two'), ('three'), (NULL), (NULL);
+ CREATE MATERIALIZED VIEW citext_matview AS
+ SELECT * FROM citext_table;
+ CREATE UNIQUE INDEX citext_matview_id
+ ON citext_matview (id);
+ SELECT *
+ FROM citext_matview m
+ FULL JOIN citext_table t ON (t.id = m.id AND t === m)
+ WHERE t.id IS NULL OR m.id IS NULL;
+ id | name | id | name
+ ----+------+----+------
+ (0 rows)
+
+ UPDATE citext_table SET name = 'Two' WHERE name = 'TWO';
+ SELECT *
+ FROM citext_matview m
+ FULL JOIN citext_table t ON (t.id = m.id AND t === m)
+ WHERE t.id IS NULL OR m.id IS NULL;
+ id | name | id | name
+ ----+------+----+------
+ | | 2 | Two
+ 2 | two | |
+ (2 rows)
+
+ REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview;
+ SELECT * FROM citext_matview ORDER BY id;
+ id | name
+ ----+-------
+ 1 | one
+ 2 | Two
+ 3 | three
+ 4 |
+ 5 |
+ (5 rows)
+
*** a/contrib/citext/sql/citext.sql
--- b/contrib/citext/sql/citext.sql
***************
*** 711,713 **** SELECT COUNT(*) = 19::bigint AS t FROM try;
--- 711,736 ----
SELECT like_escape( name, '' ) = like_escape( name::text, '' ) AS t FROM srt;
SELECT like_escape( name::text, ''::citext ) = like_escape( name::text, '' ) AS t FROM srt;
+
+ -- Ensure correct behavior for citext with materialized views.
+ CREATE TABLE citext_table (
+ id serial primary key,
+ name citext
+ );
+ INSERT INTO citext_table (name)
+ VALUES ('one'), ('two'), ('three'), (NULL), (NULL);
+ CREATE MATERIALIZED VIEW citext_matview AS
+ SELECT * FROM citext_table;
+ CREATE UNIQUE INDEX citext_matview_id
+ ON citext_matview (id);
+ SELECT *
+ FROM citext_matview m
+ FULL JOIN citext_table t ON (t.id = m.id AND t === m)
+ WHERE t.id IS NULL OR m.id IS NULL;
+ UPDATE citext_table SET name = 'Two' WHERE name = 'TWO';
+ SELECT *
+ FROM citext_matview m
+ FULL JOIN citext_table t ON (t.id = m.id AND t === m)
+ WHERE t.id IS NULL OR m.id IS NULL;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview;
+ SELECT * FROM citext_matview ORDER BY id;
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 562,568 **** refresh_by_match_merge(Oid matviewOid, Oid tempOid)
"SELECT newdata FROM %s newdata "
"WHERE newdata IS NOT NULL AND EXISTS "
"(SELECT * FROM %s newdata2 WHERE newdata2 IS NOT NULL "
! "AND newdata2 OPERATOR(pg_catalog.=) newdata "
"AND newdata2.ctid OPERATOR(pg_catalog.<>) "
"newdata.ctid) LIMIT 1",
tempname, tempname);
--- 562,568 ----
"SELECT newdata FROM %s newdata "
"WHERE newdata IS NOT NULL AND EXISTS "
"(SELECT * FROM %s newdata2 WHERE newdata2 IS NOT NULL "
! "AND newdata2 OPERATOR(pg_catalog.===) newdata "
"AND newdata2.ctid OPERATOR(pg_catalog.<>) "
"newdata.ctid) LIMIT 1",
tempname, tempname);
***************
*** 682,689 **** refresh_by_match_merge(Oid matviewOid, Oid tempOid)
errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
appendStringInfoString(&querybuf,
! " AND newdata = mv) WHERE newdata IS NULL OR mv IS NULL"
! " ORDER BY tid");
/* Create the temporary "diff" table. */
if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
--- 682,690 ----
errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
appendStringInfoString(&querybuf,
! " AND newdata OPERATOR(pg_catalog.===) mv) "
! "WHERE newdata IS NULL OR mv IS NULL "
! "ORDER BY tid");
/* Create the temporary "diff" table. */
if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
*** a/src/backend/utils/adt/rowtypes.c
--- b/src/backend/utils/adt/rowtypes.c
***************
*** 17,22 ****
--- 17,23 ----
#include <ctype.h>
#include "access/htup_details.h"
+ #include "access/tuptoaster.h"
#include "catalog/pg_type.h"
#include "libpq/pqformat.h"
#include "utils/builtins.h"
***************
*** 1281,1283 **** btrecordcmp(PG_FUNCTION_ARGS)
--- 1282,1765 ----
{
PG_RETURN_INT32(record_cmp(fcinfo));
}
+
+
+ /*
+ * record_image_cmp :
+ * Internal byte-oriented comparison function for records.
+ *
+ * Returns -1, 0 or 1
+ *
+ * Note: The normal concepts of "equality" do not apply here; different
+ * representation of values considered to be equal are not considered to be
+ * identical. As an example, for the citext type 'A' and 'a' are equal, but
+ * they are not identical.
+ */
+ static bool
+ record_image_cmp(PG_FUNCTION_ARGS)
+ {
+ HeapTupleHeader record1 = PG_GETARG_HEAPTUPLEHEADER(0);
+ HeapTupleHeader record2 = PG_GETARG_HEAPTUPLEHEADER(1);
+ int32 result = 0;
+ Oid tupType1;
+ Oid tupType2;
+ int32 tupTypmod1;
+ int32 tupTypmod2;
+ TupleDesc tupdesc1;
+ TupleDesc tupdesc2;
+ HeapTupleData tuple1;
+ HeapTupleData tuple2;
+ int ncolumns1;
+ int ncolumns2;
+ RecordCompareData *my_extra;
+ int ncols;
+ Datum *values1;
+ Datum *values2;
+ bool *nulls1;
+ bool *nulls2;
+ int i1;
+ int i2;
+ int j;
+
+ /* Extract type info from the tuples */
+ tupType1 = HeapTupleHeaderGetTypeId(record1);
+ tupTypmod1 = HeapTupleHeaderGetTypMod(record1);
+ tupdesc1 = lookup_rowtype_tupdesc(tupType1, tupTypmod1);
+ ncolumns1 = tupdesc1->natts;
+ tupType2 = HeapTupleHeaderGetTypeId(record2);
+ tupTypmod2 = HeapTupleHeaderGetTypMod(record2);
+ tupdesc2 = lookup_rowtype_tupdesc(tupType2, tupTypmod2);
+ ncolumns2 = tupdesc2->natts;
+
+ /* Build temporary HeapTuple control structures */
+ tuple1.t_len = HeapTupleHeaderGetDatumLength(record1);
+ ItemPointerSetInvalid(&(tuple1.t_self));
+ tuple1.t_tableOid = InvalidOid;
+ tuple1.t_data = record1;
+ tuple2.t_len = HeapTupleHeaderGetDatumLength(record2);
+ ItemPointerSetInvalid(&(tuple2.t_self));
+ tuple2.t_tableOid = InvalidOid;
+ tuple2.t_data = record2;
+
+ /*
+ * We arrange to look up the needed comparison info just once per series
+ * of calls, assuming the record types don't change underneath us.
+ */
+ ncols = Max(ncolumns1, ncolumns2);
+ my_extra = (RecordCompareData *) fcinfo->flinfo->fn_extra;
+ if (my_extra == NULL ||
+ my_extra->ncolumns < ncols)
+ {
+ fcinfo->flinfo->fn_extra =
+ MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+ sizeof(RecordCompareData) - sizeof(ColumnCompareData)
+ + ncols * sizeof(ColumnCompareData));
+ my_extra = (RecordCompareData *) fcinfo->flinfo->fn_extra;
+ my_extra->ncolumns = ncols;
+ my_extra->record1_type = InvalidOid;
+ my_extra->record1_typmod = 0;
+ my_extra->record2_type = InvalidOid;
+ my_extra->record2_typmod = 0;
+ }
+
+ if (my_extra->record1_type != tupType1 ||
+ my_extra->record1_typmod != tupTypmod1 ||
+ my_extra->record2_type != tupType2 ||
+ my_extra->record2_typmod != tupTypmod2)
+ {
+ MemSet(my_extra->columns, 0, ncols * sizeof(ColumnCompareData));
+ my_extra->record1_type = tupType1;
+ my_extra->record1_typmod = tupTypmod1;
+ my_extra->record2_type = tupType2;
+ my_extra->record2_typmod = tupTypmod2;
+ }
+
+ /* Break down the tuples into fields */
+ values1 = (Datum *) palloc(ncolumns1 * sizeof(Datum));
+ nulls1 = (bool *) palloc(ncolumns1 * sizeof(bool));
+ heap_deform_tuple(&tuple1, tupdesc1, values1, nulls1);
+ values2 = (Datum *) palloc(ncolumns2 * sizeof(Datum));
+ nulls2 = (bool *) palloc(ncolumns2 * sizeof(bool));
+ heap_deform_tuple(&tuple2, tupdesc2, values2, nulls2);
+
+ /*
+ * Scan corresponding columns, allowing for dropped columns in different
+ * places in the two rows. i1 and i2 are physical column indexes, j is
+ * the logical column index.
+ */
+ i1 = i2 = j = 0;
+ while (i1 < ncolumns1 || i2 < ncolumns2)
+ {
+ /*
+ * Skip dropped columns
+ */
+ if (i1 < ncolumns1 && tupdesc1->attrs[i1]->attisdropped)
+ {
+ i1++;
+ continue;
+ }
+ if (i2 < ncolumns2 && tupdesc2->attrs[i2]->attisdropped)
+ {
+ i2++;
+ continue;
+ }
+ if (i1 >= ncolumns1 || i2 >= ncolumns2)
+ break; /* we'll deal with mismatch below loop */
+
+ /*
+ * Have two matching columns, they must be same type
+ */
+ if (tupdesc1->attrs[i1]->atttypid !=
+ tupdesc2->attrs[i2]->atttypid)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare dissimilar column types %s and %s at record column %d",
+ format_type_be(tupdesc1->attrs[i1]->atttypid),
+ format_type_be(tupdesc2->attrs[i2]->atttypid),
+ j + 1)));
+
+ /*
+ * We consider two NULLs equal; NULL > not-NULL.
+ */
+ if (!nulls1[i1] || !nulls2[i2])
+ {
+ int cmpresult;
+
+ if (nulls1[i1])
+ {
+ /* arg1 is greater than arg2 */
+ result = 1;
+ break;
+ }
+ if (nulls2[i2])
+ {
+ /* arg1 is less than arg2 */
+ result = -1;
+ break;
+ }
+
+ /* Compare the pair of elements */
+ if (tupdesc1->attrs[i1]->attlen == -1)
+ {
+ Size len1,
+ len2;
+ struct varlena *arg1val;
+ struct varlena *arg2val;
+
+ len1 = toast_raw_datum_size(values1[i1]);
+ len2 = toast_raw_datum_size(values2[i2]);
+ arg1val = PG_DETOAST_DATUM_PACKED(values1[i1]);
+ arg2val = PG_DETOAST_DATUM_PACKED(values2[i2]);
+
+ cmpresult = memcmp(VARDATA_ANY(arg1val),
+ VARDATA_ANY(arg2val),
+ len1 - VARHDRSZ);
+ if ((cmpresult == 0) && (len1 != len2))
+ cmpresult = (len1 < len2) ? -1 : 1;
+
+ if ((Pointer) arg1val != (Pointer) values1[i1])
+ pfree(arg1val);
+ if ((Pointer) arg2val != (Pointer) values2[i2])
+ pfree(arg2val);
+ }
+ else
+ {
+ cmpresult = memcmp(&(values1[i1]),
+ &(values2[i2]),
+ tupdesc1->attrs[i1]->attlen);
+ }
+
+ if (cmpresult < 0)
+ {
+ /* arg1 is less than arg2 */
+ result = -1;
+ break;
+ }
+ else if (cmpresult > 0)
+ {
+ /* arg1 is greater than arg2 */
+ result = 1;
+ break;
+ }
+ }
+
+ /* equal, so continue to next column */
+ i1++, i2++, j++;
+ }
+
+ /*
+ * If we didn't break out of the loop early, check for column count
+ * mismatch. (We do not report such mismatch if we found unequal column
+ * values; is that a feature or a bug?)
+ */
+ if (result == 0)
+ {
+ if (i1 != ncolumns1 || i2 != ncolumns2)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare record types with different numbers of columns")));
+ }
+
+ pfree(values1);
+ pfree(nulls1);
+ pfree(values2);
+ pfree(nulls2);
+ ReleaseTupleDesc(tupdesc1);
+ ReleaseTupleDesc(tupdesc2);
+
+ /* Avoid leaking memory when handed toasted input. */
+ PG_FREE_IF_COPY(record1, 0);
+ PG_FREE_IF_COPY(record2, 1);
+
+ return result;
+ }
+
+ /*
+ * record_image_eq :
+ * compares two records for identical contents, based on byte images
+ * result :
+ * returns true if the records are identical, false otherwise.
+ *
+ * Note: we do not use record_image_cmp here, since we can avoid
+ * de-toasting for unequal lengths this way.
+ */
+ Datum
+ record_image_eq(PG_FUNCTION_ARGS)
+ {
+ HeapTupleHeader record1 = PG_GETARG_HEAPTUPLEHEADER(0);
+ HeapTupleHeader record2 = PG_GETARG_HEAPTUPLEHEADER(1);
+ bool result = true;
+ Oid tupType1;
+ Oid tupType2;
+ int32 tupTypmod1;
+ int32 tupTypmod2;
+ TupleDesc tupdesc1;
+ TupleDesc tupdesc2;
+ HeapTupleData tuple1;
+ HeapTupleData tuple2;
+ int ncolumns1;
+ int ncolumns2;
+ RecordCompareData *my_extra;
+ int ncols;
+ Datum *values1;
+ Datum *values2;
+ bool *nulls1;
+ bool *nulls2;
+ int i1;
+ int i2;
+ int j;
+
+ /* Extract type info from the tuples */
+ tupType1 = HeapTupleHeaderGetTypeId(record1);
+ tupTypmod1 = HeapTupleHeaderGetTypMod(record1);
+ tupdesc1 = lookup_rowtype_tupdesc(tupType1, tupTypmod1);
+ ncolumns1 = tupdesc1->natts;
+ tupType2 = HeapTupleHeaderGetTypeId(record2);
+ tupTypmod2 = HeapTupleHeaderGetTypMod(record2);
+ tupdesc2 = lookup_rowtype_tupdesc(tupType2, tupTypmod2);
+ ncolumns2 = tupdesc2->natts;
+
+ /* Build temporary HeapTuple control structures */
+ tuple1.t_len = HeapTupleHeaderGetDatumLength(record1);
+ ItemPointerSetInvalid(&(tuple1.t_self));
+ tuple1.t_tableOid = InvalidOid;
+ tuple1.t_data = record1;
+ tuple2.t_len = HeapTupleHeaderGetDatumLength(record2);
+ ItemPointerSetInvalid(&(tuple2.t_self));
+ tuple2.t_tableOid = InvalidOid;
+ tuple2.t_data = record2;
+
+ /*
+ * We arrange to look up the needed comparison info just once per series
+ * of calls, assuming the record types don't change underneath us.
+ */
+ ncols = Max(ncolumns1, ncolumns2);
+ my_extra = (RecordCompareData *) fcinfo->flinfo->fn_extra;
+ if (my_extra == NULL ||
+ my_extra->ncolumns < ncols)
+ {
+ fcinfo->flinfo->fn_extra =
+ MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+ sizeof(RecordCompareData) - sizeof(ColumnCompareData)
+ + ncols * sizeof(ColumnCompareData));
+ my_extra = (RecordCompareData *) fcinfo->flinfo->fn_extra;
+ my_extra->ncolumns = ncols;
+ my_extra->record1_type = InvalidOid;
+ my_extra->record1_typmod = 0;
+ my_extra->record2_type = InvalidOid;
+ my_extra->record2_typmod = 0;
+ }
+
+ if (my_extra->record1_type != tupType1 ||
+ my_extra->record1_typmod != tupTypmod1 ||
+ my_extra->record2_type != tupType2 ||
+ my_extra->record2_typmod != tupTypmod2)
+ {
+ MemSet(my_extra->columns, 0, ncols * sizeof(ColumnCompareData));
+ my_extra->record1_type = tupType1;
+ my_extra->record1_typmod = tupTypmod1;
+ my_extra->record2_type = tupType2;
+ my_extra->record2_typmod = tupTypmod2;
+ }
+
+ /* Break down the tuples into fields */
+ values1 = (Datum *) palloc(ncolumns1 * sizeof(Datum));
+ nulls1 = (bool *) palloc(ncolumns1 * sizeof(bool));
+ heap_deform_tuple(&tuple1, tupdesc1, values1, nulls1);
+ values2 = (Datum *) palloc(ncolumns2 * sizeof(Datum));
+ nulls2 = (bool *) palloc(ncolumns2 * sizeof(bool));
+ heap_deform_tuple(&tuple2, tupdesc2, values2, nulls2);
+
+ /*
+ * Scan corresponding columns, allowing for dropped columns in different
+ * places in the two rows. i1 and i2 are physical column indexes, j is
+ * the logical column index.
+ */
+ i1 = i2 = j = 0;
+ while (i1 < ncolumns1 || i2 < ncolumns2)
+ {
+ /*
+ * Skip dropped columns
+ */
+ if (i1 < ncolumns1 && tupdesc1->attrs[i1]->attisdropped)
+ {
+ i1++;
+ continue;
+ }
+ if (i2 < ncolumns2 && tupdesc2->attrs[i2]->attisdropped)
+ {
+ i2++;
+ continue;
+ }
+ if (i1 >= ncolumns1 || i2 >= ncolumns2)
+ break; /* we'll deal with mismatch below loop */
+
+ /*
+ * Have two matching columns, they must be same type
+ */
+ if (tupdesc1->attrs[i1]->atttypid !=
+ tupdesc2->attrs[i2]->atttypid)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare dissimilar column types %s and %s at record column %d",
+ format_type_be(tupdesc1->attrs[i1]->atttypid),
+ format_type_be(tupdesc2->attrs[i2]->atttypid),
+ j + 1)));
+
+ /*
+ * We consider two NULLs equal; NULL > not-NULL.
+ */
+ if (!nulls1[i1] || !nulls2[i2])
+ {
+ if (nulls1[i1] || nulls2[i2])
+ {
+ result = false;
+ break;
+ }
+
+ /* Compare the pair of elements */
+ if (tupdesc1->attrs[i1]->attlen == -1)
+ {
+ Size len1,
+ len2;
+
+ len1 = toast_raw_datum_size(values1[i1]);
+ len2 = toast_raw_datum_size(values2[i2]);
+ /* No need to de-toast if lengths don't match. */
+ if (len1 != len2)
+ result = false;
+ else
+ {
+ struct varlena *arg1val;
+ struct varlena *arg2val;
+
+ arg1val = PG_DETOAST_DATUM_PACKED(values1[i1]);
+ arg2val = PG_DETOAST_DATUM_PACKED(values2[i2]);
+
+ result = (memcmp(VARDATA_ANY(arg1val),
+ VARDATA_ANY(arg2val),
+ len1 - VARHDRSZ) == 0);
+
+ /* Only free memory if it's a copy made here. */
+ if ((Pointer) arg1val != (Pointer) values1[i1])
+ pfree(arg1val);
+ if ((Pointer) arg2val != (Pointer) values2[i2])
+ pfree(arg2val);
+ }
+ }
+ else
+ {
+ result = (memcmp(&(values1[i1]),
+ &(values2[i2]),
+ tupdesc1->attrs[i1]->attlen) == 0);
+ }
+ if (!result)
+ break;
+ }
+
+ /* equal, so continue to next column */
+ i1++, i2++, j++;
+ }
+
+ /*
+ * If we didn't break out of the loop early, check for column count
+ * mismatch. (We do not report such mismatch if we found unequal column
+ * values; is that a feature or a bug?)
+ */
+ if (result)
+ {
+ if (i1 != ncolumns1 || i2 != ncolumns2)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare record types with different numbers of columns")));
+ }
+
+ pfree(values1);
+ pfree(nulls1);
+ pfree(values2);
+ pfree(nulls2);
+ ReleaseTupleDesc(tupdesc1);
+ ReleaseTupleDesc(tupdesc2);
+
+ /* Avoid leaking memory when handed toasted input. */
+ PG_FREE_IF_COPY(record1, 0);
+ PG_FREE_IF_COPY(record2, 1);
+
+ PG_RETURN_BOOL(result);
+ }
+
+ Datum
+ record_image_ne(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_BOOL(!DatumGetBool(record_image_eq(fcinfo)));
+ }
+
+ Datum
+ record_image_lt(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_BOOL(record_image_cmp(fcinfo) < 0);
+ }
+
+ Datum
+ record_image_gt(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_BOOL(record_image_cmp(fcinfo) > 0);
+ }
+
+ Datum
+ record_image_le(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_BOOL(record_image_cmp(fcinfo) <= 0);
+ }
+
+ Datum
+ record_image_ge(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_BOOL(record_image_cmp(fcinfo) >= 0);
+ }
+
+ Datum
+ btrecordimagecmp(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_INT32(record_image_cmp(fcinfo));
+ }
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
***************
*** 493,498 **** DATA(insert ( 2994 2249 2249 4 s 2993 403 0 ));
--- 493,508 ----
DATA(insert ( 2994 2249 2249 5 s 2991 403 0 ));
/*
+ * btree record_image_ops
+ */
+
+ DATA(insert ( 3194 2249 2249 1 s 3190 403 0 ));
+ DATA(insert ( 3194 2249 2249 2 s 3192 403 0 ));
+ DATA(insert ( 3194 2249 2249 3 s 3188 403 0 ));
+ DATA(insert ( 3194 2249 2249 4 s 3193 403 0 ));
+ DATA(insert ( 3194 2249 2249 5 s 3191 403 0 ));
+
+ /*
* btree uuid_ops
*/
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
***************
*** 122,127 **** DATA(insert ( 1989 26 26 1 356 ));
--- 122,128 ----
DATA(insert ( 1989 26 26 2 3134 ));
DATA(insert ( 1991 30 30 1 404 ));
DATA(insert ( 2994 2249 2249 1 2987 ));
+ DATA(insert ( 3194 2249 2249 1 3187 ));
DATA(insert ( 1994 25 25 1 360 ));
DATA(insert ( 1996 1083 1083 1 1107 ));
DATA(insert ( 2000 1266 1266 1 1358 ));
*** a/src/include/catalog/pg_opclass.h
--- b/src/include/catalog/pg_opclass.h
***************
*** 143,148 **** DATA(insert ( 405 oid_ops PGNSP PGUID 1990 26 t 0 ));
--- 143,149 ----
DATA(insert ( 403 oidvector_ops PGNSP PGUID 1991 30 t 0 ));
DATA(insert ( 405 oidvector_ops PGNSP PGUID 1992 30 t 0 ));
DATA(insert ( 403 record_ops PGNSP PGUID 2994 2249 t 0 ));
+ DATA(insert ( 403 record_image_ops PGNSP PGUID 3194 2249 f 0 ));
DATA(insert OID = 3126 ( 403 text_ops PGNSP PGUID 1994 25 t 0 ));
#define TEXT_BTREE_OPS_OID 3126
DATA(insert ( 405 text_ops PGNSP PGUID 1995 25 t 0 ));
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
***************
*** 1672,1677 **** DESCR("less than or equal");
--- 1672,1691 ----
DATA(insert OID = 2993 ( ">=" PGNSP PGUID b f f 2249 2249 16 2992 2990 record_ge scalargtsel scalargtjoinsel ));
DESCR("greater than or equal");
+ /* byte-oriented tests for identical rows and fast sorting */
+ DATA(insert OID = 3188 ( "===" PGNSP PGUID b t f 2249 2249 16 3188 3189 record_image_eq eqsel eqjoinsel ));
+ DESCR("identical");
+ DATA(insert OID = 3189 ( "!==" PGNSP PGUID b f f 2249 2249 16 3189 3188 record_image_ne neqsel neqjoinsel ));
+ DESCR("not identical");
+ DATA(insert OID = 3190 ( "<<<" PGNSP PGUID b f f 2249 2249 16 3191 3193 record_image_lt scalarltsel scalarltjoinsel ));
+ DESCR("less than");
+ DATA(insert OID = 3191 ( ">>>" PGNSP PGUID b f f 2249 2249 16 3190 3192 record_image_gt scalargtsel scalargtjoinsel ));
+ DESCR("greater than");
+ DATA(insert OID = 3192 ( "<==" PGNSP PGUID b f f 2249 2249 16 3193 3191 record_image_le scalarltsel scalarltjoinsel ));
+ DESCR("less than or equal");
+ DATA(insert OID = 3193 ( ">==" PGNSP PGUID b f f 2249 2249 16 3192 3190 record_image_ge scalargtsel scalargtjoinsel ));
+ DESCR("greater than or equal");
+
/* generic range type operators */
DATA(insert OID = 3882 ( "=" PGNSP PGUID b t t 3831 3831 16 3882 3883 range_eq eqsel eqjoinsel ));
DESCR("equal");
*** a/src/include/catalog/pg_opfamily.h
--- b/src/include/catalog/pg_opfamily.h
***************
*** 96,101 **** DATA(insert OID = 1990 ( 405 oid_ops PGNSP PGUID ));
--- 96,102 ----
DATA(insert OID = 1991 ( 403 oidvector_ops PGNSP PGUID ));
DATA(insert OID = 1992 ( 405 oidvector_ops PGNSP PGUID ));
DATA(insert OID = 2994 ( 403 record_ops PGNSP PGUID ));
+ DATA(insert OID = 3194 ( 403 record_image_ops PGNSP PGUID ));
DATA(insert OID = 1994 ( 403 text_ops PGNSP PGUID ));
#define TEXT_BTREE_FAM_OID 1994
DATA(insert OID = 1995 ( 405 text_ops PGNSP PGUID ));
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 4470,4476 **** DESCR("get set of in-progress txids in snapshot");
DATA(insert OID = 2948 ( txid_visible_in_snapshot PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "20 2970" _null_ _null_ _null_ _null_ txid_visible_in_snapshot _null_ _null_ _null_ ));
DESCR("is txid visible in snapshot?");
! /* record comparison */
DATA(insert OID = 2981 ( record_eq PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_eq _null_ _null_ _null_ ));
DATA(insert OID = 2982 ( record_ne PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_ne _null_ _null_ _null_ ));
DATA(insert OID = 2983 ( record_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_lt _null_ _null_ _null_ ));
--- 4470,4476 ----
DATA(insert OID = 2948 ( txid_visible_in_snapshot PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "20 2970" _null_ _null_ _null_ _null_ txid_visible_in_snapshot _null_ _null_ _null_ ));
DESCR("is txid visible in snapshot?");
! /* record comparison using normal comparison rules */
DATA(insert OID = 2981 ( record_eq PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_eq _null_ _null_ _null_ ));
DATA(insert OID = 2982 ( record_ne PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_ne _null_ _null_ _null_ ));
DATA(insert OID = 2983 ( record_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_lt _null_ _null_ _null_ ));
***************
*** 4480,4485 **** DATA(insert OID = 2986 ( record_ge PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0
--- 4480,4495 ----
DATA(insert OID = 2987 ( btrecordcmp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 23 "2249 2249" _null_ _null_ _null_ _null_ btrecordcmp _null_ _null_ _null_ ));
DESCR("less-equal-greater");
+ /* record comparison using raw byte images */
+ DATA(insert OID = 3181 ( record_image_eq PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_image_eq _null_ _null_ _null_ ));
+ DATA(insert OID = 3182 ( record_image_ne PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_image_ne _null_ _null_ _null_ ));
+ DATA(insert OID = 3183 ( record_image_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_image_lt _null_ _null_ _null_ ));
+ DATA(insert OID = 3184 ( record_image_gt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_image_gt _null_ _null_ _null_ ));
+ DATA(insert OID = 3185 ( record_image_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_image_le _null_ _null_ _null_ ));
+ DATA(insert OID = 3186 ( record_image_ge PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ record_image_ge _null_ _null_ _null_ ));
+ DATA(insert OID = 3187 ( btrecordimagecmp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 23 "2249 2249" _null_ _null_ _null_ _null_ btrecordimagecmp _null_ _null_ _null_ ));
+ DESCR("less-equal-greater based on byte images");
+
/* Extensions */
DATA(insert OID = 3082 ( pg_available_extensions PGNSP PGUID 12 10 100 0 0 f f f f t t s 0 0 2249 "" "{19,25,25}" "{o,o,o}" "{name,default_version,comment}" _null_ pg_available_extensions _null_ _null_ _null_ ));
DESCR("list available extensions");
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
***************
*** 631,636 **** extern Datum record_gt(PG_FUNCTION_ARGS);
--- 631,643 ----
extern Datum record_le(PG_FUNCTION_ARGS);
extern Datum record_ge(PG_FUNCTION_ARGS);
extern Datum btrecordcmp(PG_FUNCTION_ARGS);
+ extern Datum record_image_eq(PG_FUNCTION_ARGS);
+ extern Datum record_image_ne(PG_FUNCTION_ARGS);
+ extern Datum record_image_lt(PG_FUNCTION_ARGS);
+ extern Datum record_image_gt(PG_FUNCTION_ARGS);
+ extern Datum record_image_le(PG_FUNCTION_ARGS);
+ extern Datum record_image_ge(PG_FUNCTION_ARGS);
+ extern Datum btrecordimagecmp(PG_FUNCTION_ARGS);
/* ruleutils.c */
extern bool quote_all_identifiers;
*** a/src/test/regress/expected/opr_sanity.out
--- b/src/test/regress/expected/opr_sanity.out
***************
*** 1037,1049 **** ORDER BY 1, 2, 3;
--- 1037,1054 ----
amopmethod | amopstrategy | oprname
------------+--------------+---------
403 | 1 | <
+ 403 | 1 | <<<
403 | 1 | ~<~
403 | 2 | <=
+ 403 | 2 | <==
403 | 2 | ~<=~
403 | 3 | =
+ 403 | 3 | ===
403 | 4 | >=
+ 403 | 4 | >==
403 | 4 | ~>=~
403 | 5 | >
+ 403 | 5 | >>>
403 | 5 | ~>~
405 | 1 | =
783 | 1 | <<
***************
*** 1098,1104 **** ORDER BY 1, 2, 3;
4000 | 15 | >
4000 | 16 | @>
4000 | 18 | =
! (62 rows)
-- Check that all opclass search operators have selectivity estimators.
-- This is not absolutely required, but it seems a reasonable thing
--- 1103,1109 ----
4000 | 15 | >
4000 | 16 | @>
4000 | 18 | =
! (67 rows)
-- Check that all opclass search operators have selectivity estimators.
-- This is not absolutely required, but it seems a reasonable thing
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers