pg_upgrade: Use COPY for large object metadata.

Presently, pg_dump generates commands like

    SELECT pg_catalog.lo_create('5432');
    ALTER LARGE OBJECT 5432 OWNER TO alice;
    GRANT SELECT ON LARGE OBJECT 5432 TO bob;

for each large object.  This is particularly slow at restore time,
especially when there are tens or hundreds of millions of large
objects.  From reports and personal experience, such slow restores
seem to be most painful when encountered during pg_upgrade.  This
commit teaches pg_dump to instead dump pg_largeobject_metadata and
the corresponding pg_shdepend rows when in binary upgrade mode,
i.e., pg_dump now generates commands like

    COPY pg_catalog.pg_largeobject_metadata (oid, lomowner, lomacl) FROM stdin;
    5432        16384   {alice=rw/alice,bob=r/alice}
    \.

    COPY pg_catalog.pg_shdepend (dbid, classid, objid, objsubid, refclassid, 
refobjid, deptype) FROM stdin;
    5   2613    5432    0       1260    16384   o
    5   2613    5432    0       1260    16385   a
    \.

Testing indicates the COPY approach can be significantly faster.
To do any better, we'd probably need to find a way to copy/link
pg_largeobject_metadata's files during pg_upgrade, which would be
limited to upgrades from >= v16 (since commit 7b378237aa changed
the storage format for aclitem, which is used for
pg_largeobject_metadata.lomacl).

Note that this change only applies to binary upgrade mode (i.e.,
dumps initiated by pg_upgrade) since it inserts rows directly into
catalogs.  Also, this optimization can only be used for upgrades
from >= v12 because pg_largeobject_metadata was created WITH OIDS
in older versions, which prevents pg_dump from handling
pg_largeobject_metadata.oid properly.  With some extra effort, it
might be possible to support upgrades from older versions, but the
added complexity didn't seem worth it to support versions that will
have been out-of-support for nearly 3 years by the time this change
is released.

Experienced hackers may remember that prior to v12, pg_upgrade
copied/linked pg_largeobject_metadata's files (see commit
12a53c732c).  Besides the aforementioned storage format issues,
this approach failed to transfer the relevant pg_shdepend rows, and
pg_dump still had to generate an lo_create() command per large
object so that creating the dependent comments and security labels
worked.  We could perhaps adopt a hybrid approach for upgrades from
v16 and newer (i.e., generate lo_create() commands for each large
object, copy/link pg_largeobject_metadata's files, and COPY the
relevant pg_shdepend rows), but further testing is needed.

Reported-by: Hannu Krosing <han...@google.com>
Suggested-by: Tom Lane <t...@sss.pgh.pa.us>
Reviewed-by: Hannu Krosing <han...@google.com>
Reviewed-by: Nitin Motiani <nitinmoti...@google.com>
Reviewed-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: 
https://postgr.es/m/CAMT0RQSS-6qLH%2BzYsOeUbAYhop3wmQTkNmQpo5--QRDUR%2BqYmQ%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/161a3e8b682ebb98ea0b9d5015d22990696b99ec

Modified Files
--------------
src/bin/pg_dump/pg_backup_archiver.c | 15 ++++++
src/bin/pg_dump/pg_dump.c            | 90 ++++++++++++++++++++++++++++++++++--
src/bin/pg_dump/t/002_pg_dump.pl     |  4 +-
3 files changed, 103 insertions(+), 6 deletions(-)

Reply via email to