While working on Materialize's streaming logical replication from Postgres [0], my colleagues Sean Loiselle and Petros Angelatos (CC'd) discovered today what appears to be a correctness bug in pgoutput, introduced in v15.
The problem goes like this. A table with REPLICA IDENTITY FULL and some data in it... CREATE TABLE t (a int); ALTER TABLE t REPLICA IDENTITY FULL; INSERT INTO t VALUES (1), (2), (3), ...; ...undergoes a schema change to add a new column with a default: ALTER TABLE t ADD COLUMN b bool DEFAULT false NOT NULL; PostgreSQL is smart and does not rewrite the entire table during the schema change. Instead it updates the tuple description to indicate to future readers of the table that if `b` is missing, it should be filled in with the default value, `false`. Unfortunately, since v15, pgoutput mishandles missing attributes. If a downstream server is subscribed to changes from t via the pgoutput plugin, when a row with a missing attribute is updated, e.g.: UPDATE t SET a = 2 WHERE a = 1 pgoutput will incorrectly report b's value as NULL in the old tuple, rather than false. Using the same example: old: a=1, b=NULL new: a=2, b=true The subscriber will ignore the update (as it has no row with values a=1, b=NULL), and thus the subscriber's copy of `t` will become out of sync with the publisher's. I bisected the problem to 52e4f0cd4 [1], which introduced row filtering for publications. The problem appears to be the use of CreateTupleDescCopy where CreateTupleDescCopyConstr is required, as the former drops the constraints in the tuple description (specifically, the default value constraint) on the floor. I've attached a patch which both fixes the issue and includes a test. I've verified that the test fails against the current master and passes against the patched version. I'm relatively unfamiliar with the project norms here, but assuming the patch is acceptable, this strikes me as important enough to warrant a backport to both v15 and v16. [0]: https://materialize.com/docs/sql/create-source/postgres [1]: https://github.com/postgres/postgres/commit/52e4f0cd472d39d07732b99559989ea3b615be78
0001-pgoutput-don-t-unconditionally-fill-in-missing-value.patch
Description: Binary data