On Fri, Jan 05, 2024 at 03:02:34PM -0500, Tom Lane wrote:
> On further reflection, there is a very good reason why it's done like
> that. Because pg_upgrade is doing schema-only dump and restore,
> there's next to no opportunity for parallelism within either pg_dump
> or pg_restore. There's no data-loading steps, and there's no
> index-building either, so the time-consuming stuff that could be
> parallelized just isn't happening in pg_upgrade's usage.
>
> Now it's true that my 0003 patch moves the needle a little bit:
> since it makes BLOB creation (as opposed to loading) parallelizable,
> there'd be some hope for parallel pg_restore doing something useful in
> a database with very many blobs. But it makes no sense to remove the
> existing cross-database parallelism in pursuit of that; you'd make
> many more people unhappy than happy.

I assume the concern is that we'd end up multiplying the effective number
of workers if we parallelized both in-database and cross-database? Would
it be sufficient to make those separately configurable with a note about
the multiplicative effects of setting both? I think it'd be unfortunate
if pg_upgrade completely missed out on this improvement.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com