Re: Non-text mode for pg_dumpall

Noah Misch Sat, 06 Jun 2026 17:02:39 -0700

On Thu, Feb 26, 2026 at 09:02:48AM -0500, Andrew Dunstan wrote:
> pushed with a slight tweak.


Having now reviewed commit 763aaa0, I don't think it's ready to remain part of
v19.  While some points from my v18 review are now resolved, other points
still seem unresolved.  I didn't find discussion of the unresolved points.  I
also see new issues.


Wider issues:

> @@ -796,24 +964,45 @@ dropRoles(PGconn *conn)

> +             if (archDumpFormat == archNull)
> +             {
> +                     appendPQExpBuffer(delQry, "DROP ROLE %s%s;\n",
> +                                                       if_exists ? "IF 
> EXISTS " : "",
> +                                                       fmtId(rolename));
> +                     fprintf(OPF, "%s", delQry->data);
> +             }
> +             else
> +             {
> +                     appendPQExpBuffer(delQry, "DROP ROLE IF EXISTS %s;\n",
> +                                                       fmtId(rolename));
> +
> +                     ArchiveEntry(fout,
> +                                              nilCatalogId,  /* catalog ID */
> +                                              createDumpId(),        /* dump 
> ID */
> +                                              ARCHIVE_OPTS(.tag = 
> psprintf("ROLE %s", fmtId(rolename)),
> +                                                                       
> .description = "DROP_GLOBAL",
> +                                                                       
> .section = SECTION_PRE_DATA,
> +                                                                       
> .createStmt = delQry->data));
> +             }

pg_dumpall.c should call ArchiveEntry() in the same way pg_dump.c does.  In
pg_dump.c, per-object-class support code calls ArchiveEntry() unconditionally,
and the object-independent infra of pg_backup_archiver.c deals with the
difference between plain and non-plain formats.

There should be one appendPQExpBuffer(delQry, "DROP ROLE ..."), not one for
plain format and another for non-plain formats.  Having two creates excess
risk of format-specific bugs, something pg_dump.c has long avoided well.  (I'm
echoing my postgr.es/m/[email protected] review.  I wrote,
"The strength of the archiver architecture shows in how rarely new features
need format-specific logic and how rarely format-specific bugs get reported."
That holds for the way pg_dump.c uses the archiver, but it doesn't hold for
the way pg_dumpall.c now uses the archiver.)

Separately, DROP should be in dropStmt, not in createStmt.  This likely
entails refactoring to merge dumpRoles() and dropRoles() into one function,
per the style of pg_dump.c.

> +             /*
> +              * For pg_dumpall archives, --clean implies --if-exists since 
> global
> +              * objects may not exist in the target cluster.
> +              */
> +             if (opts->dropSchema && !opts->if_exists)
> +             {
> +                     opts->if_exists = 1;
> +                     pg_log_info("--if-exists is implied by --clean for 
> pg_dumpall archives");
> +             }

The last comment of postgr.es/m/[email protected] disagreed
with this decision, so that remains unresolved.

> +             /* If database is already created, then don't set createDB 
> flag. */
> +             if (tmpopts->cparams.dbname)
> +             {
> +                     PGconn     *test_conn;
> +
> +                     test_conn = ConnectDatabase(dbidname->str, NULL, 
> tmpopts->cparams.pghost,
> +                                                                             
> tmpopts->cparams.pgport, tmpopts->cparams.username, TRI_DEFAULT,
> +                                                                             
> false, progname, NULL, NULL, NULL, NULL);
> +                     if (test_conn)
> +                     {
> +                             PQfinish(test_conn);
> +
> +                             /* Use already created database for connection. 
> */
> +                             tmpopts->createDB = 0;
> +                             tmpopts->cparams.dbname = dbidname->str;
> +                     }
> +                     else
> +                     {
> +                             /* We'll have to create it */
> +                             tmpopts->createDB = 1;
> +                             tmpopts->cparams.dbname = connected_db;
> +                     }

postgr.es/m/[email protected] called for this to change.

None of this is meant to say the feature is impossible. But I don’t think this
commit is at the point where post-commit fixups are the right workflow.  I
recommend reverting, posting a new version, and letting commitfest review
finish.


Localized issues:

There's new code to write map and globals files, but I don't see corresponding
code to fsync those files, like we fsync a plain-format dump.

> @@ -3027,6 +3049,16 @@ _tocEntryRequired(TocEntry *te, teSection curSection, 
> ArchiveHandle *AH)
>                       return 0;
>       }
>  
> +     /*
> +      * Global object TOC entries (e.g., ROLEs or TABLESPACEs) must not be
> +      * ignored.
> +      */
> +     if (strcmp(te->desc, "ROLE") == 0 ||
> +             strcmp(te->desc, "ROLE PROPERTIES") == 0 ||
> +             strcmp(te->desc, "TABLESPACE") == 0 ||
> +             strcmp(te->desc, "DROP_GLOBAL") == 0)
> +             return REQ_SCHEMA;
> +

If I delete this addition, no src/bin/pg_dump test fails.  Would you explain
the rationale for this addition?

> +                     if (comment_buf->data[0] != '\0')
> +                             ArchiveEntry(fout,
> +                                                      nilCatalogId,  /* 
> catalog ID */
> +                                                      createDumpId(),        
> /* dump ID */
> +                                                      ARCHIVE_OPTS(.tag = 
> tag,
> +                                                                             
>   .description = "COMMENT",
> +                                                                             
>   .section = SECTION_PRE_DATA,
> +                                                                             
>   .createStmt = comment_buf->data));
> +
> +                     if (seclabel_buf->data[0] != '\0')
> +                             ArchiveEntry(fout,
> +                                                      nilCatalogId,  /* 
> catalog ID */
> +                                                      createDumpId(),        
> /* dump ID */
> +                                                      ARCHIVE_OPTS(.tag = 
> tag,
> +                                                                             
>   .description = "SECURITY LABEL",
> +                                                                             
>   .section = SECTION_PRE_DATA,
> +                                                                             
>   .createStmt = seclabel_buf->data));

COMMENT and SECURITY LABEL should use .deps and .nDeps to record a dependency
on the main object.  Representative example from pg_dump.c:

                ArchiveEntry(fout, nilCatalogId, createDumpId(),
                                         ARCHIVE_OPTS(.tag = target->data,
                                                                  .namespace = 
tbinfo->dobj.namespace->dobj.name,
                                                                  .owner = 
tbinfo->rolname,
                                                                  .description 
= "SECURITY LABEL",
                                                                  .section = 
SECTION_NONE,
                                                                  .createStmt = 
query->data,
                                                                  .deps = 
&(tbinfo->dobj.dumpId),
                                                                  .nDeps = 1));

This probably has no user-visible consequences, since "/* Parallel execution
is not supported for global object restoration. */".  Still, best to follow
the standard.

> +             /*
> +              * If this is not a plain format dump, then append dboid and 
> dbname to
> +              * the map.dat file.
> +              */

The first six lines inside the block aren't for the purpose described here.
The one line that is for this purpose has its own comment.  Please delete this
comment.

> +             if (archDumpFormat != archNull)
> +             {
> +                     if (archDumpFormat == archCustom)
> +                             snprintf(dbfilepath, MAXPGPATH, 
> "\"%s\"/\"%s\".dmp", db_subdir, oid);
> +                     else if (archDumpFormat == archTar)
> +                             snprintf(dbfilepath, MAXPGPATH, 
> "\"%s\"/\"%s\".tar", db_subdir, oid);
> +                     else
> +                             snprintf(dbfilepath, MAXPGPATH, 
> "\"%s\"/\"%s\"", db_subdir, oid);

My review in postgr.es/m/[email protected] called for this
to change.  I'm finding no discussion of the rationale for not changing it.

> +
> +                     /* Put one line entry for dboid and dbname in map file. 
> */
> +                     fprintf(map_file, "%s %s\n", oid, dbname);
> +             }

> +runPgDump(const char *dbname, const char *create_opts, char *dbfile)

> +     if (archDumpFormat != archNull)
> +     {
> +             printfPQExpBuffer(&cmd, "\"%s\" %s -f %s %s", pg_dump_bin,
> +                                               pgdumpopts->data, dbfile, 
> create_opts);

A different "-f" argument is already in pgdumpopts.  That works, but it's
untidy.

> +                     pg_fatal("option %s cannot exclude %s when restoring a 
> pg_dumpall archive",
> +                                      "--section", "--pre-data");

s/--pre-data/pre-data/

> +             /*
> +              * Always restore global objects, even if --exclude-database 
> results
> +              * in zero databases to process. If 'globals-only' is set, exit
> +              * immediately.

I think this comment is out of date, because the code doesn't have an exit:

> +              */
> +             snprintf(global_path, MAXPGPATH, "%s/toc.glo", inputFileSpec);
> +
> +             n_errors = restore_global_objects(global_path, tmpopts);
> +
> +             if (globals_only)
> +                     pg_log_info("database restoring skipped because option 
> %s was specified",
> +                                             "-g/--globals-only");
> +             else
> +             {
> +                     /* Now restore all the databases from map.dat */
> +                     n_errors = n_errors + 
> restore_all_databases(inputFileSpec, db_exclude_patterns,
> +                                                                             
>                                 opts, numWorkers);
> +             }
> +
> +             /* Free db pattern list. */
> +             simple_string_list_destroy(&db_exclude_patterns);
> +     }

> +     /* Don't output TOC entry comments when restoring globals */
> +     ((ArchiveHandle *) AH)->noTocComments = 1;

Why not?

> +static int
> +get_dbnames_list_to_restore(PGconn *conn,
> +                                                     SimplePtrList 
> *dbname_oid_list,
> +                                                     SimpleStringList 
> db_exclude_patterns)
> +{
...
> +                     if (pg_strcasecmp(dbidname->str, pat_cell->val) == 0)
> +                             skip_db_restore = true;
> +                     /* Otherwise, try a pattern match if there is a 
> connection */

The code assumes a connection unconditionally (which seems right to me):

> +                     else
> +                     {
> +                             int                     dotcnt;
> +
> +                             appendPQExpBufferStr(query, "SELECT 1 ");
> +                             processSQLNamePattern(conn, query, 
> pat_cell->val, false,
> +                                                                       
> false, NULL, db_lit->data,
> +                                                                       NULL, 
> NULL, NULL, &dotcnt);
> +
> +                             if (dotcnt > 0)
> +                             {
> +                                     pg_log_error("improper qualified name 
> (too many dotted names): %s",
> +                                                              dbidname->str);
> +                                     PQfinish(conn);
> +                                     exit_nicely(1);
> +                             }
> +
> +                             res = executeQuery(conn, query->data);

Re: Non-text mode for pg_dumpall

Reply via email to