"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > At some point, I think we have to bite the bullet and find a way to use > multiple CPUs for a single load. I don't have any good ideas or plans > for that, but hopefully someone does.
As already mentioned upthread, we could do that today, with zero backend changes, by making pg_restore drive multiple sessions. Now there are scenarios where this wouldn't help too much --- eg, a database with only one enormous table. We couldn't parallelize the loading of that table, although we could parallelize creation of its indexes. But for an example such as JD is complaining about, we ought to have no trouble thrashing his disks into the ground ;-)

What you would need for this is an -Fc or -Ft dump, because a plain script output from pg_dump doesn't carry any dependency information, much less any index of where in the file different bits of data are.

Just armwaving, I envision a multiprocess pg_restore like this:

* one controller process that tracks the not-yet-loaded TOC items in the dump

* N child processes that just transmit a selected TOC item to a connected backend, and then ask the controller what to do next

The controller would need to use the dependency information to avoid, eg, handing out a CREATE INDEX command before the parent table was created and loaded.

One issue is that this couldn't use "-1" single-transaction restoring, since obviously each child would need its own transaction, and furthermore would have to commit before going back to the controller for more work (since dependent TOC items might well get loaded by a different child later). That defeats a couple of optimizations that Simon put in recently. The one for no XLOG during COPY is not too hard to see how to re-enable, but I'm not sure what else there was.

Most likely, the bottleneck with this sort of thing would be multiple parallel reads from the pg_dump archive file. Possibly the controller process could be taught to schedule COPY and CREATE INDEX operations so that not too many processes are trying to read lots of archive data at the same time.

			regards, tom lane
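To make the armwaving a bit more concrete, here is a minimal sketch of that controller loop in Python. It is purely illustrative, not pg_restore code: threads stand in for the N child processes, `run_item` stands in for a child sending one TOC item's SQL to its backend and committing, and the `parallel_restore`/`toc`/`deps` names and shapes are all invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def parallel_restore(toc, deps, n_workers, run_item):
    """Dispatch TOC items in dependency order across n_workers sessions.

    toc      -- list of TOC item ids from the archive (hypothetical)
    deps     -- {item: set of items that must commit first}
    run_item -- stands in for a child session restoring one item and
                committing before it asks the controller for more work
    """
    pending = {item: set(deps.get(item, ())) for item in toc}
    done = set()
    futures = {}  # in-flight future -> TOC item

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        def launch_ready():
            # Hand out every item whose prerequisites have all committed,
            # e.g. never a CREATE INDEX before its table's COPY is done.
            for item in [i for i, d in pending.items() if d <= done]:
                del pending[item]
                futures[pool.submit(run_item, item)] = item

        launch_ready()
        while futures:
            finished, _ = wait(futures, return_when=FIRST_COMPLETED)
            for fut in finished:
                fut.result()            # propagate any restore error
                done.add(futures.pop(fut))
            launch_ready()              # hand out newly unblocked items
    return done
```

The same loop is also the natural place to throttle archive I/O: before handing out a COPY or CREATE INDEX, the controller could check how many children are currently doing bulk reads and hold the item back until one finishes.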