Hi Andrew, Thanks for your review!
My opinion is that this is a bugfix rather than a feature: Rather than adding new capability it's fixing pg_upgrade's behaviour, because it currently fails in the described scenario (SRID-constrained columns). The new code path isn't user facing and it only fires during pg_upgrade's internal use of pg_dump --binary-upgrade. However, I will defer to the committers' judgement on whether this should be included in PG19 and backpatched. To address your feedback, please find attached v2 which: 1. Removes dumpExtensionData() and adds the handling for EXTENSION DATA object type to dumpTableData() 2. Adds test in test_pg_dump: we insert a row into the dumpable extension table, and we expect that the COPY appears in --binary-upgrade dumps. I have also added a commitfest entry. Thanks again, Jimmy On Sun, Mar 29, 2026 at 7:34 PM Andrew Dunstan <[email protected]> wrote: > > On 2026-03-20 Fr 1:47 PM, Jimmy Angelakos wrote: > > Hi All, > > > > I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is > > my first code patch, so any feedback on the approach will be appreciated! > > > > The problem: > > ============ > > pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the > > schema from $oldcluster to $newcluster. Because this excludes all > > table data, it leaves out data in extension config tables registered > > with pg_extension_config_dump(). > > > > In $newcluster, binary_upgrade_create_empty_extension() creates the > > extensions without populating any table data. The extensions' CREATE > > EXTENSION scripts never get executed so any INSERTs are skipped. As a > > consequence, if any CREATE TABLE statement in $newcluster requires > > validation against these empty config tables, the upgrade fails. As an > > example, > > PostGIS registers config table spatial_ref_sys to hold ~8500 spatial > > reference system definitions (SRIDs). When a table has, e.g. a > > geometry column that specifies an SRID, this gets validated during the > > CREATE TABLE: > > > > CREATE TABLE points (id int, location geometry(Point, 27700)); > > ERROR: Cannot find SRID (27700) in spatial_ref_sys > > > > This will happen for any SRID-constrained column, which will prevent > > many real-world PostGIS deployments from being able to pg_upgrade. To > > summarise the problem, our ordering is wrong here because extension > > configuration data must be present before user tables that depend on > > it get created, but --schema-only strips this data. > > > > The patch: > > ========== > > We are adding a new dump object type DO_EXTENSION_DATA that dumps > > extension config table data in SECTION_PRE_DATA during > > --binary-upgrade ONLY. This restores the needed data between extension > > creation and user object creation, allowing the DDL to succeed. > > > > Four files are modified in bin/pg_dump: > > > > pg_dump.h: > > Add DO_EXTENSION_DATA to the DumpableObjectType enum, between > > DO_EXTENSION and DO_TYPE > > > > pg_dump_sort.c: > > Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE > > > > pg_dump.c: > > 1. Add makeExtensionDataInfo() to create a TableDataInfo with objType > > = DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during > > --binary-upgrade ONLY. As it depends on the table def, the COPY will > > be emitted after the CREATE TABLE. > > 2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with > > description "EXTENSION DATA" using dumpTableData_copy(). This allows > > the config table data to go into the schema-only dump. > > 3. In processExtensionTables(), when dopt->binary_upgrade is true, > > call makeExtensionDataInfo() instead of makeTableDataInfo(). > > Additionally, skip extcondition filter because we need to dump all > > rows here. > > 4. Include DO_EXTENSION_DATA in pre-data boundary in > > addBoundaryDependencies() > > > > pg_backup_archiver.c: > > Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar > > to BLOB, BLOB METADATA, etc. to include extension config table data in > > --schema-only dumps during --binary-upgrade ONLY. > > > > What ends up happening: > > ======================= > > The inserted rows are basically scaffolding to allow the upgrade, and > > do not persist. The pg_upgrade sequence goes like: > > 1. pg_dump includes $oldcluster extension config data in schema-only dump > > 2. pg_restore replays the dump into $newcluster and "EXTENSION DATA" > > entries populate tables like spatial_ref_sys with COPY. Subsequent > > CREATE TABLEs with e.g. SRID-constrained columns pass validation. > > 3. pg_upgrade transfers all data files from $oldcluster to > > $newcluster, making spatial_ref_sys byte-for-byte identical to its > > previous state. > > > > This patch: > > 1. Does NOT affect normal pg_dumps (without --binary-upgrade). > > DO_EXTENSION_DATA objects are not created in this case. > > 2. Leaves binary_upgrade_create_empty_extension() unchanged. > > 3. Is not PostGIS-specific, and should solve this class of problem for > > any extension that registers config tables that will be needed for DDL > > validation. > > 4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster > > PostGIS 3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on > > PG19-devel. > > > > Thanks in advance for your review! Please find attached the patch for > > HEAD. I believe this should be easily backpatchable to (at least) > > PG15, and will be happy to work on backports. > > > Hi, Jimmy. > > First, as you probably know, we don't backpatch features, and I think > this comes into that category. Unfortunately, we're about to close > release 19 for features, so this would need to wait till release 20. > > The patch didn't include any tests. It will need them (probably in > src/test/modules/test_pg_dump) > > There appears to be a lot of code duplication between > dumpExtensionData() and dumpTableData(). It might be better to refactor > that, perhaps by supplying an extra flag to dumpTableData(). > > Do make sure to add a Commitfest entry for this is you haven't already > done so. > > > cheers > > > andrew > > > > > > > > > > > Best regards, > > Jimmy > > -- > Andrew Dunstan > EDB: https://www.enterprisedb.com > > -- Jimmy Angelakos Staff Software Engineer [email protected] pgEdge.com <http://pgedge.com/>
From 46ab846425e8ae62f3cc4dec6b88799c217a20f8 Mon Sep 17 00:00:00 2001 From: Jimmy Angelakos <[email protected]> Date: Fri, 20 Mar 2026 16:04:45 +0000 Subject: [PATCH v2] pg_dump: Restore extension config table data before user objects during binary upgrade pg_upgrade uses pg_dump --schema-only --binary-upgrade, which excludes all table data including extension configuration tables registered via pg_extension_config_dump(). Since binary_upgrade_create_empty_extension() does not populate these tables, any user table whose CREATE TABLE triggers validation against config data will fail. For example, PostGIS tables with SRID-constrained geometry/geography columns fail because spatial_ref_sys is empty during schema restore. Fix by introducing a new dump object type DO_EXTENSION_DATA that dumps extension config table data into SECTION_PRE_DATA during binary upgrade. This puts the data restore between extension creation and user object creation, allowing DDL-time validation to succeed. The data is scaffolding: it is overwritten when pg_upgrade transfers the old cluster's data files to the new cluster. This is not PostGIS-specific and applies to any extension that registers config tables via pg_extension_config_dump() where that data is needed for DDL-time validation. --- src/bin/pg_dump/pg_backup_archiver.c | 2 + src/bin/pg_dump/pg_dump.c | 82 ++++++++++++++++++- src/bin/pg_dump/pg_dump.h | 1 + src/bin/pg_dump/pg_dump_sort.c | 7 ++ src/test/modules/test_pg_dump/t/001_base.pl | 1 - .../test_pg_dump/test_pg_dump--1.0.sql | 1 + 6 files changed, 89 insertions(+), 5 deletions(-) diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index fecf6f2d1ce..8b2acbe4936 100644 --- a/src/bin/pg_dump/pg_backup_archiver.c +++ b/src/bin/pg_dump/pg_backup_archiver.c @@ -3310,6 +3310,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) */ if (strcmp(te->desc, "SEQUENCE SET") == 0 || strcmp(te->desc, "BLOB") == 0 || + strcmp(te->desc, "EXTENSION DATA") == 0 || strcmp(te->desc, "BLOB METADATA") == 0 || (strcmp(te->desc, "ACL") == 0 && strncmp(te->tag, "LARGE OBJECT", 12) == 0) || @@ -3351,6 +3352,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) && !(ropt->binary_upgrade && (strcmp(te->desc, "BLOB") == 0 || + strcmp(te->desc, "EXTENSION DATA") == 0 || strcmp(te->desc, "BLOB METADATA") == 0 || (strcmp(te->desc, "ACL") == 0 && strncmp(te->tag, "LARGE OBJECT", 12) == 0) || diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index d34240073bb..e3cd4c8e8d4 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -352,6 +352,7 @@ static void addConstrChildIdxDeps(DumpableObject *dobj, const IndxInfo *refidx); static void getDomainConstraints(Archive *fout, TypeInfo *tyinfo); static void getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, char relkind); static void makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo); +static void makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo); static void buildMatViewRefreshDependencies(Archive *fout); static void getTableDataFKConstraints(void); static void determineNotNullFlags(Archive *fout, PGresult *res, int r, @@ -2864,6 +2865,8 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo) char *tdDefn = NULL; char *copyStmt; const char *copyFrom; + const char *description = "TABLE DATA"; + teSection section = SECTION_DATA; /* We had better have loaded per-column details about this table */ Assert(tbinfo->interesting); @@ -2910,6 +2913,16 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo) copyStmt = NULL; } + /* + * Extension config table data goes into SECTION_PRE_DATA so it is + * available before user tables that may need it for validation. + */ + if (tdinfo->dobj.objType == DO_EXTENSION_DATA) + { + description = "EXTENSION DATA"; + section = SECTION_PRE_DATA; + } + /* * Note: although the TableDataInfo is a full DumpableObject, we treat its * dependency on its table as "special" and pass it to ArchiveEntry now. @@ -2923,8 +2936,8 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo) ARCHIVE_OPTS(.tag = tbinfo->dobj.name, .namespace = tbinfo->dobj.namespace->dobj.name, .owner = tbinfo->rolname, - .description = "TABLE DATA", - .section = SECTION_DATA, + .description = description, + .section = section, .createStmt = tdDefn, .copyStmt = copyStmt, .deps = &(tbinfo->dobj.dumpId), @@ -3105,6 +3118,48 @@ makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo) tbinfo->interesting = true; } +/* + * makeExtensionDataInfo --- create TableDataInfo for extension config table + * + * This is used during binary upgrades to ensure extension configuration + * table data is dumped early (before user tables that may depend on it). + * For example, PostGIS's spatial_ref_sys must be populated before any + * table with geometry(Point, 27700) can be created due to SRID validation. + */ +static void +makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo) +{ + TableDataInfo *tdinfo; + + /* Already have a data object? */ + if (tbinfo->dataObj != NULL) + return; + + /* + * Caller ensures that this is only called for RELKIND_RELATION. + */ + + /* OK, create the data object */ + tdinfo = (TableDataInfo *) pg_malloc(sizeof(TableDataInfo)); + + tdinfo->dobj.objType = DO_EXTENSION_DATA; + + tdinfo->dobj.catId.tableoid = 0; + tdinfo->dobj.catId.oid = tbinfo->dobj.catId.oid; + AssignDumpId(&tdinfo->dobj); + tdinfo->dobj.name = tbinfo->dobj.name; + tdinfo->dobj.namespace = tbinfo->dobj.namespace; + tdinfo->tdtable = tbinfo; + tdinfo->filtercond = NULL; + addObjectDependency(&tdinfo->dobj, tbinfo->dobj.dumpId); + + /* Mark that this object contains data */ + tdinfo->dobj.components |= DUMP_COMPONENT_DATA; + + tbinfo->dataObj = tdinfo; + tbinfo->interesting = true; +} + /* * The refresh for a materialized view must be dependent on the refresh for * any materialized view that this one is dependent on. @@ -11838,6 +11893,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj) case DO_EXTENSION: dumpExtension(fout, (const ExtensionInfo *) dobj); break; + case DO_EXTENSION_DATA: + dumpTableData(fout, (const TableDataInfo *) dobj); + break; case DO_TYPE: dumpType(fout, (const TypeInfo *) dobj); break; @@ -20393,10 +20451,25 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[], if (dumpobj) { - makeTableDataInfo(dopt, configtbl); + /* + * For binary upgrades, dump extension config table data + * before user tables are created so it's available for + * validation (e.g. PostGIS SRIDs). + */ + if (dopt->binary_upgrade && + configtbl->relkind == RELKIND_RELATION) + makeExtensionDataInfo(dopt, configtbl); + else + makeTableDataInfo(dopt, configtbl); if (configtbl->dataObj != NULL) { - if (strlen(extconditionarray[j]) > 0) + /* + * For binary upgrade (DO_EXTENSION_DATA), don't apply + * the filter condition - we need ALL data since the + * extension won't populate built-in data in binary + * upgrade mode. + */ + if (strlen(extconditionarray[j]) > 0 && !dopt->binary_upgrade) configtbl->dataObj->filtercond = pg_strdup(extconditionarray[j]); } } @@ -20674,6 +20747,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs, { case DO_NAMESPACE: case DO_EXTENSION: + case DO_EXTENSION_DATA: case DO_TYPE: case DO_SHELL_TYPE: case DO_FUNC: diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h index 5a6726d8b12..1e8bb961e8d 100644 --- a/src/bin/pg_dump/pg_dump.h +++ b/src/bin/pg_dump/pg_dump.h @@ -40,6 +40,7 @@ typedef enum /* When modifying this enum, update priority tables in pg_dump_sort.c! */ DO_NAMESPACE, DO_EXTENSION, + DO_EXTENSION_DATA, /* extension config table data for binary upgrade */ DO_TYPE, DO_SHELL_TYPE, DO_FUNC, diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index 03e5c1c1116..3cced9c27be 100644 --- a/src/bin/pg_dump/pg_dump_sort.c +++ b/src/bin/pg_dump/pg_dump_sort.c @@ -58,6 +58,7 @@ enum dbObjectTypePriorities PRIO_COLLATION, PRIO_TRANSFORM, PRIO_EXTENSION, + PRIO_EXTENSION_DATA, /* ext config data: used for binary upgrade */ PRIO_TYPE, /* used for DO_TYPE and DO_SHELL_TYPE */ PRIO_CAST, PRIO_FUNC, @@ -106,6 +107,7 @@ static const int dbObjectTypePriority[] = { [DO_NAMESPACE] = PRIO_NAMESPACE, [DO_EXTENSION] = PRIO_EXTENSION, + [DO_EXTENSION_DATA] = PRIO_EXTENSION_DATA, [DO_TYPE] = PRIO_TYPE, [DO_SHELL_TYPE] = PRIO_TYPE, [DO_FUNC] = PRIO_FUNC, @@ -1525,6 +1527,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize) "EXTENSION %s (ID %d OID %u)", obj->name, obj->dumpId, obj->catId.oid); return; + case DO_EXTENSION_DATA: + snprintf(buf, bufsize, + "EXTENSION DATA %s (ID %d OID %u)", + obj->name, obj->dumpId, obj->catId.oid); + return; case DO_TYPE: snprintf(buf, bufsize, "TYPE %s (ID %d OID %u)", diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl index 3d65ce4497a..db1cb22a13b 100644 --- a/src/test/modules/test_pg_dump/t/001_base.pl +++ b/src/test/modules/test_pg_dump/t/001_base.pl @@ -515,7 +515,6 @@ my %tests = ( extension_schema => 1, }, unlike => { - binary_upgrade => 1, exclude_table => 1, exclude_extension => 1, exclude_extension_filter => 1, diff --git a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql index 1c68e146d91..134743e3943 100644 --- a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql +++ b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql @@ -18,6 +18,7 @@ CREATE TABLE regress_table_dumpable ( col1 int check (col1 > 0) ); SELECT pg_catalog.pg_extension_config_dump('regress_table_dumpable', ''); +INSERT INTO regress_table_dumpable VALUES (27700); GRANT SELECT ON regress_table_dumpable TO public; CREATE SCHEMA regress_pg_dump_schema; -- 2.51.0
