On Thu, Sep 18, 2025 at 5:37 PM R Wahyudi <[email protected]> wrote:
> I've been given a database dump file daily and I've been asked to restore
> it.
> I tried everything I could to speed up the process, including using -j 40.
>
> I discovered that at the later stage of the restore process, the
> following behaviour repeated a few times:
> 40 x pg_restore processes at 100% CPU

Threads are not magic.  IO and memory limitations still exist.

> 40 x postgres processes doing COPY but using 0% CPU
> ..... and zero disk write activity
>
> I don't see this behaviour when restoring the database that was dumped
> with -Fd.
> Also, with an un-piped backup file, I can restore a specific table without
> having to wait for hours.

We explained this three days ago.  Heck, it's in this very email.  Click on
"the three dots" and scroll down a bit.

> On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <[email protected]> wrote:
>
>> On 9/18/25 05:58, R Wahyudi wrote:
>> > Hi All,
>> >
>> > Thanks for the quick and accurate response!  I've never been so happy
>> > to see IOwait on my system!
>>
>> Because?
>>
>> What did you find?
>>
>> > I might be blind, as I can't find information about 'offset' in the
>> > pg_dump documentation.
>> > Where can I find more info about this?
>>
>> It is not in the user documentation.
>>
>> From the thread Ron referred to, there is an explanation here:
>>
>> https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us
>>
>> I believe the actual code, for the -Fc format, is in pg_backup_custom.c
>> here:
>>
>> https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723
>>
>> Per the comment at line 755:
>>
>> "
>> If possible, re-write the TOC in order to update the data offset
>> information.  This is not essential, as pg_restore can cope in most
>> cases without it; but it can make pg_restore significantly faster
>> in some situations (especially parallel restore).  We can skip this
>> step if we're not dumping any data; there are no offsets to update
>> in that case.
>> "
>>
>> > Regards,
>> > Rianto
>> >
>> > On Wed, 17 Sept 2025 at 13:48, Ron Johnson <[email protected]> wrote:
>> >
>> >     PG 17 has integrated zstd compression, while --format=directory lets
>> >     you do multi-threaded dumps.  That's much faster than a single-
>> >     threaded pg_dump into a multi-threaded compression program.
>> >
>> >     (If for _Reasons_ you require a single-file backup, then tar the
>> >     directory of compressed files using the --remove-files option.)
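
As a concrete illustration of that suggestion (a minimal sketch only, with
placeholder host/database/path names and thread counts, and assuming a
pg_dump new enough to have built-in zstd compression):

    cd /backups
    # multi-threaded directory-format dump; each table's data file is
    # compressed with zstd inside the dump directory
    pg_dump -Fd -j 8 --compress=zstd:3 -h <host> -U <user> -d <database> -f <database>.dir

    # optional single file for transport; --remove-files deletes the dump
    # directory's contents as they are added to the archive
    tar -cf <database>.dir.tar --remove-files <database>.dir

    # to restore, untar first (parallel restore wants the directory back), then:
    tar -xf <database>.dir.tar
    pg_restore -j 40 -h <host> -U <user> -d <database> <database>.dir
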
>> >
>> >     On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi <[email protected]> wrote:
>> >
>> >         Sorry for not including the full command - yes, it's piping to a
>> >         compression command:
>> >         | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
>> >
>> >         I think we found the issue!  I'll do further testing and see how
>> >         it goes!
>> >
>> >         On Wed, 17 Sept 2025 at 11:02, Ron Johnson <[email protected]> wrote:
>> >
>> >             So, piping or redirecting to a file?  If so, then that's the
>> >             problem.
>> >
>> >             pg_dump directly to a file puts file offsets in the TOC.
>> >
>> >             This is how I do custom dumps:
>> >             cd $BackupDir
>> >             pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump 2> ${db}.log
>> >
>> >             On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi <[email protected]> wrote:
>> >
>> >                 pg_dump was done using the following command:
>> >                 pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
>> >
>> >                 On Wed, 17 Sept 2025 at 08:36, Adrian Klaver <[email protected]> wrote:
>> >
>> >                     On 9/16/25 15:25, R Wahyudi wrote:
>> >                     >
>> >                     > I'm trying to troubleshoot the slowness issue with
>> >                     > pg_restore and stumbled across a recent post about
>> >                     > pg_restore scanning the whole file:
>> >                     >
>> >                     > > "scanning happens in a very inefficient way, with
>> >                     > > many seek calls and small block reads. Try strace to
>> >                     > > see them. This initial phase can take hours in a huge
>> >                     > > dump file, before even starting any actual restoration."
>> >                     >
>> >                     > see: https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
>> >
>> >                     This was for pg_dump output that was streamed to a Borg
>> >                     archive and as a result had no object offsets in the TOC.
>> >
>> >                     How are you doing your pg_dump?
>> >
>> >                     --
>> >                     Adrian Klaver
>> >                     [email protected]
>>
>> --
>> Adrian Klaver
>> [email protected]
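
To recap the root cause with a sketch (placeholder names, not the exact
commands from the thread): the custom-format TOC only gets data offsets when
pg_dump can seek back over its output, i.e. when it writes straight to a
file. Written to a pipe, the archive has no offsets, and a parallel
pg_restore has to scan the whole file to locate each table's data.

    # TOC gets data offsets: output is a seekable file
    pg_dump -Fc -Z 0 -h <host> -U <user> -d <database> -f <database>.dump

    # TOC gets no data offsets: stdout is a pipe, so pg_dump cannot seek back
    pg_dump -Fc -Z 0 -h <host> -U <user> -d <database> \
        | lbzip2 -n <threads> --best > <database>.dump.bz2

If in doubt, the strace suggestion quoted above applies: restoring an
offset-less archive shows long runs of small seek and read calls before any
COPY traffic appears.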
