Hi,
On 2026-02-05 13:31:23 -0600, Nathan Bossart wrote:
> On Thu, Feb 05, 2026 at 01:02:17PM -0500, Andres Freund wrote:
> > Upthread I was also wondering why we do all the work in getLOs() if we
> > don't actually need most of it (it's only needed if there are comments or
> > labels). Right now that's a very slow and very memory-intensive part of
> > upgrading a system with a lot of large objects. Do we need *any* of that
> > if we go the path you suggest?
>
> AFAICT we only need it for the comments and security labels later on.
And in binary upgrade mode not even for that, if we do the thing we talked
about re not checking references in binary upgrade mode? Right?
> Commit a45c78e3 did batch 1000 large objects into each ArchiveEntry, but of
> course there can still be a ton of entries.
Entries and then also the PGresult (almost as large as all the entries). The
peak memory usage is pretty bad. We could address the PGresult memory usage
with PQsetChunkedRowsMode() or explicit cursor use, but it doesn't seem
entirely trivial to combine with the batching.
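E.g. something like this for the cursor variant (untested; the cursor name
and batch size are just illustrative, and the ORDER BY matches the sort key
the batching relies on):

    BEGIN;
    DECLARE lo_meta CURSOR FOR
        SELECT oid, lomowner, lomacl
        FROM pg_largeobject_metadata
        ORDER BY lomowner, lomacl::text, oid;
    -- repeat until fewer than 1000 rows come back
    FETCH FORWARD 1000 FROM lo_meta;
    CLOSE lo_meta;
    COMMIT;

That keeps the client from ever holding more than one batch of rows, but the
server still has to sort the whole catalog up front.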
> In theory, we could update the pg_largeobject_metadata query to only
> retrieve LOs with comments and security labels. I'm not sure it's worth
> trying to optimize further than that; we've long operated under the
> assumption that comments/seclabels on LOs are pretty rare.
I think that'd be a huge improvement. Right now it's not hard to get into a
situation where you have so many LOs that there isn't enough memory to do a
pg_upgrade.
Memory usage aside, it's also slow and expensive on the query execution and
data transfer side. Because of the ORDER BY that the batching requires, the
server needs to sort all of pg_largeobject_metadata before any rows can be
returned.
For 5M LOs:
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=715978.34..728478.39 rows=5000020 width=72) (actual time=10292.252..10652.950 rows=5000020.00 loops=1)
   Sort Key: pg_largeobject_metadata.lomowner, ((pg_largeobject_metadata.lomacl)::text), pg_largeobject_metadata.oid
   Sort Method: quicksort  Memory: 509110kB
   ->  Seq Scan on pg_largeobject_metadata  (cost=0.00..159638.55 rows=5000020 width=72) (actual time=0.034..2284.442 rows=5000020.00 loops=1)
         SubPlan expr_1
           ->  Result  (cost=0.00..0.01 rows=1 width=32) (actual time=0.000..0.000 rows=1.00 loops=5000020)
 Planning Time: 0.117 ms
 Serialization: time=3961.343 ms  output=218686kB  format=text
 Execution Time: 14930.747 ms
(9 rows)
/usr/bin/time -v pg_dump --binary-upgrade --no-data --quote-all-identifiers \
  --no-statistics --format=custom -f /tmp/dump lo_5m

Command being timed: "pg_dump --binary-upgrade --no-data --quote-all-identifiers --no-statistics --format=custom -f /tmp/dump lo_5m"
User time (seconds): 1.85
System time (seconds): 0.54
Percent of CPU this job got: 16%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.55
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 935084
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 133379
Voluntary context switches: 30486
Involuntary context switches: 6
Swaps: 0
File system inputs: 0
File system outputs: 16
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Peak memory usage ~1GB.
And unfortunately 5M LOs is not a whole lot.
Filtering the LOs in the query to only return the ones that have a comment /
label would typically make the query much faster and make pg_dump consume a
lot less memory.
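Untested sketch of such a filter (the classoid/EXISTS details are per the
catalogs, the rest of getLOs()'s query is elided):

    SELECT l.oid, l.lomowner, l.lomacl
    FROM pg_largeobject_metadata l
    WHERE EXISTS (SELECT 1 FROM pg_description d
                  WHERE d.classoid = 'pg_largeobject'::regclass
                    AND d.objoid = l.oid)
       OR EXISTS (SELECT 1 FROM pg_seclabel s
                  WHERE s.classoid = 'pg_largeobject'::regclass
                    AND s.objoid = l.oid);

With comments/seclabels on LOs being rare, there'd typically be only a
handful of rows left to return and sort.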
Greetings,
Andres Freund