In the past, I've read [this post][1] from Marc Millas that reports
`pg_dump ... | psql` at throughput around 500MB/s (5Gb/s) on a 10Gb/s
network.

Today, I've tried pg_dump on a single schema of 25 tables, 88K rows, 70MB
total (as reported by pg total relation sizes). 1 table of 225 rows
contains blobs totaling 48MB of the 70MB by itself. The largest table has
40K rows.

And it takes around 10s (+/- 1s), generating a ~ 100MB dump file (or
directory).
I tried all 4 formats, and even with -jN with N in 2,3,4 for -Fd. Little to
no difference, both in duration and output sizes.
I tried against a v12 and v14 Linux servers (real hardware, server-class
machines, in the $10K range each, 2 years old and 7 years old, running
RedHat 7)
I'm on a 1Gb/s LAN.
The client side is a Windows 11 PRO desktop running 16.1 release builds of
pg_dump (and my own client).

These results mean a throughput of only around 10MB/s, when considering the
output dump, and less when considering the relation sizes (70MB)

Then I tried my own "dump", using COPY BINARY. And got a 58MB dump file in
around 5-6s, with my libpq wrapper reporting metrics of the COPY itself
(i.e. just the libpq COPY APIs used, independent of how I write the results
to disk) in the 10-12MB/s throughput range. So basically not quite twice
faster than pg_dump, with a resulting dump not quite half as small.

The reason I'm writing this is because I'm surprised how slow that is.
There's a chasm between 500MB/s and 10MB/s.
What could explain such slow performance?
Of pg_dump? And my own slightly faster dump?

Also, how come `-Fd -j4` is not any faster than the serial version,
especially when there's a table with few rows and large blobs that is 68%
of the total?

What could be wrong to explain such poor throughput?
Especially given that even pg_dump itself is slow, thus it doesn't appear
to be my code.

I'm concerned with these results, and would appreciate some expert advice.

Thanks, --DD

PS: The PostgreSQL installations are mostly "stock", with little to no
tuning...

[1]:
https://postgrespro.com/list/id/cadx_1aanc4gibvwjjeomuxgxtt9xw8gavrctc4jwgnbspjo...@mail.gmail.com

Reply via email to