On 9/18/25 2:36 PM, R Wahyudi wrote:
I'm given a database dump file daily and have been asked to restore it.
I tried everything I could to speed up the process, including using -j 40.
I discovered that at a later stage of the restore the following
behaviour repeats a few times:
40 pg_restore processes at 100% CPU
40 postgres processes running COPY but at 0% CPU
..... and zero disk write activity
I don't see this behaviour when restoring a database that was dumped
with -Fd.
Also, with an un-piped backup file I can restore a specific table
without having to wait for hours.
From the docs:
https://www.postgresql.org/docs/current/app-pgrestore.html
"
-j number-of-jobs
Only the custom and directory archive formats are supported with this
option. The input must be a regular file or directory (not, for example,
a pipe or standard input). Also, multiple jobs cannot be used together
with the option --single-transaction.
"
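[Editor's note: the docs' restriction comes down to seekability: each parallel worker must be able to seek straight to its table's data, which a pipe cannot support. The file-type distinction is the same one test(1) makes; a throwaway sketch, nothing PostgreSQL-specific, with placeholder paths:]

```shell
# A named pipe is not a regular file, which is why a piped dump cannot
# be restored with -j; a plain dump file passes the same check.
mkfifo /tmp/demo.fifo
test -p /tmp/demo.fifo && echo "pipe: not seekable, -j refused"
touch /tmp/demo.dump
test -f /tmp/demo.dump && echo "regular file: -j allowed (with -Fc/-Fd)"
rm -f /tmp/demo.fifo /tmp/demo.dump
```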
--
On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <[email protected]> wrote:
On 9/18/25 05:58, R Wahyudi wrote:
> Hi All,
>
> Thanks for the quick and accurate response! I've never been so happy
> seeing iowait on my system!
Because?
What did you find?
>
> I might be blind, as I can't find information about 'offset' in the
> pg_dump documentation.
> Where can I find more info about this?
It is not in the user documentation.
From the thread Ron referred to, there is an explanation here:
https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us
I believe the actual code, for the -Fc format, is in pg_backup_custom.c
here:
https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723
Per comment at line 755:
"
If possible, re-write the TOC in order to update the data offset
information. This is not essential, as pg_restore can cope in most
cases without it; but it can make pg_restore significantly faster
in some situations (especially parallel restore). We can skip this
step if we're not dumping any data; there are no offsets to update
in that case.
"
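[Editor's note: the comment describes a seek-back pattern: write a placeholder TOC, stream the table data, then seek back to the start and rewrite the TOC with the now-known offsets, which only works on a seekable output. A tiny illustration of that pattern, nothing PostgreSQL-specific, using dd's conv=notrunc to overwrite in place:]

```shell
# Simulate a file whose header ("TOC?") was written before the data.
printf 'TOC?----DATA' > /tmp/toc_demo.bin
# Seek back and rewrite the header in place; conv=notrunc keeps the rest.
# This rewrite is impossible when the destination is a pipe.
printf 'TOC!' | dd of=/tmp/toc_demo.bin conv=notrunc status=none
cat /tmp/toc_demo.bin   # prints: TOC!----DATA
```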
>
> Regards,
> Rianto
>
> On Wed, 17 Sept 2025 at 13:48, Ron Johnson <[email protected]> wrote:
>
>
> PG 17 has integrated zstd compression, while --format=directory lets
> you do multi-threaded dumps. That's much faster than a
> single-threaded pg_dump into a multi-threaded compression program.
>
> (If for _Reasons_ you require a single-file backup, then tar the
> directory of compressed files using the --remove-files option.)
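[Editor's note: Ron's single-file variant might look like the sketch below. The pg_dump step is shown only as a comment (it needs a running server), with dummy files standing in for the dump directory so the GNU tar step is runnable; paths, job count, and the database name are placeholders:]

```shell
# In real use the directory would come from a parallel directory-format dump:
#   pg_dump -Fd -j 8 --compress=zstd -d mydb -f /tmp/pgdemo/dumpdir
mkdir -p /tmp/pgdemo/dumpdir
printf 'toc'  > /tmp/pgdemo/dumpdir/toc.dat
printf 'data' > /tmp/pgdemo/dumpdir/3001.dat.zst
# GNU tar's --remove-files deletes each member after archiving it,
# leaving a single backup.tar behind.
tar -C /tmp/pgdemo -cf /tmp/pgdemo/backup.tar --remove-files dumpdir
ls /tmp/pgdemo
```

Restore would then be `tar -xf backup.tar` followed by a parallel `pg_restore -j ... -d mydb dumpdir`.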
>
> On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi <[email protected]> wrote:
>
> Sorry for not including the full command - yes, it's piping to a
> compression command:
> | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
>
> I think we found the issue! I'll do further testing and see how
> it goes!
>
> On Wed, 17 Sept 2025 at 11:02, Ron Johnson <[email protected]> wrote:
>
> So, piping or redirecting to a file? If so, then that's the
> problem.
>
> pg_dump directly to a file puts file offsets in the TOC.
>
> This is how I do custom dumps:
> cd $BackupDir
> pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump 2> ${db}.log
>
> On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi <[email protected]> wrote:
>
> pg_dump was done using the following command:
> pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
>
> On Wed, 17 Sept 2025 at 08:36, Adrian Klaver <[email protected]> wrote:
>
> On 9/16/25 15:25, R Wahyudi wrote:
> >
> > I'm trying to troubleshoot the slowness issue with pg_restore and
> > stumbled across a recent post about pg_restore scanning the whole file:
> >
> > "scanning happens in a very inefficient way, with many seek calls and
> > small block reads. Try strace to see them. This initial phase can take
> > hours in a huge dump file, before even starting any actual restoration."
> > see: https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
>
> This was for pg_dump output that was streamed to a Borg archive and as a
> result had no object offsets in the TOC.
>
> How are you doing your pg_dump?
>
> --
> Adrian Klaver
> [email protected]
>
>
>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
>
>
>
--
Adrian Klaver
[email protected]