On Fri, May 25, 2007 at 05:28:43PM -0400, Tom Lane wrote:
> That's true at the level of DDL operations, but AFAIK we could
> parallelize table-loading and index-creation steps pretty effectively
> --- and that's where all the time goes.

I made a presentation at OSCON a few years ago about how we did it
that way when we imported .org.  We had limited time to work in, and
we had to do a lot of validation, so getting the data in quickly was
important.  So we split the data files up into segments and loaded
them in parallel (Chris Browne did most of the implementation of
this.)  It was pretty helpful for loading, anyway.

> A more interesting question is what sort of hardware you need for that
> actually to be a win, though.  Loading a few tables in parallel sounds
> like an ideal recipe for oversaturating your disk bandwidth...

Right, you need to be prepared for that.  But of course, if you're in
the situation where you have to get a given database up and running,
who cares about the disk bandwidth? -- you don't have the database
running yet.  The kind of system that is busy enough to have that size
of database and that urgency of recovery is also the kind that is
likely to have dedicated storage hardware for that database.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Unfortunately reformatting the Internet is a little more painful
than reformatting your hard drive when it gets out of whack.
		--Scott Morris
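
For readers who want something concrete, the segment-and-load approach
described above can be sketched roughly as follows.  This is only an
illustration under stated assumptions, not the implementation used for
the .org import: the names split_into_segments, load_in_parallel and
load_segment are hypothetical, and load_segment stands in for whatever
actually sends one segment to the database (for example, one COPY per
connection).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_segments(lines: List[str], n_segments: int) -> List[List[str]]:
    """Split rows into at most n_segments contiguous, near-equal chunks."""
    if n_segments < 1:
        raise ValueError("n_segments must be >= 1")
    size, extra = divmod(len(lines), n_segments)
    segments = []
    start = 0
    for i in range(n_segments):
        # The first `extra` segments take one additional row each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty segments when rows < n_segments
            segments.append(lines[start:end])
        start = end
    return segments


def load_in_parallel(
    segments: List[List[str]],
    load_segment: Callable[[List[str]], int],
    max_workers: int = 4,
) -> List[int]:
    """Run load_segment on each segment concurrently.

    load_segment is an assumed callback: it should load one segment
    (ideally over its own connection) and return the rows loaded.
    Results come back in segment order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_segment, segments))
```

The max_workers value is the knob that matters for the disk-bandwidth
concern raised in the thread: more workers than the storage can keep
fed will not make the load finish sooner.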