As the OP, I'll just note that my organization would definitely find use for a parallel migrator tool, as long as it supported selecting a subset of tables (i.e. -t / -T) in addition to the whole database, and it supported (or we were able to patch in) an option to CLUSTER as part of the migration (the equivalent of something like https://github.com/tgarnett/postgres/commit/cc320a71 ).
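For context, this is roughly what we do today with stock tools, and what we'd hope a migrator could collapse into a single parallel step (an illustrative sketch, not our actual script; host, database, table, and index names are placeholders):

    # Selective dump to a directory archive, restored in parallel,
    # followed by a manual CLUSTER pass on the tables that need it.
    pg_dump -h old-host -Fd -j 4 -t big_table -t other_table -f /tmp/dump mydb
    pg_restore -h new-host -j 4 -d mydb /tmp/dump
    psql -h new-host -d mydb -c 'CLUSTER big_table USING big_table_pkey;'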
Tim

On Wed, Apr 24, 2013 at 5:47 PM, Joachim Wieland <j...@mcknight.de> wrote:

> On Wed, Apr 24, 2013 at 4:05 PM, Stefan Kaltenbrunner <
> ste...@kaltenbrunner.cc> wrote:
>
>> > What might make sense is something like pg_dump_restore which would have
>> > no intermediate storage at all, just pump the data etc from one source
>> > to another in parallel. But I pity the poor guy who has to write it :-)
>>
>> hmm pretty sure that Joachim's initial patch for parallel dump actually
>> had a PoC for something very similar to that...
>
> That's right, I implemented that as its own output format and named it
> "migrator" I think, which wouldn't write each stream to a file as the
> directory output format does, but instead pumps it back into a restore
> client.
>
> Actually I think the logic was even reversed: it was a parallel restore
> that got the data from internally calling pg_dump functionality instead of
> from reading files... The neat thing about this approach was that the order
> was optimized and correct, i.e. largest tables start first and dependencies
> get resolved in the right order.
>
> I could revisit that patch for 9.4 if enough people are interested.
>
> Joachim
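P.S. I'll add for the archives that the serial version of the "no intermediate storage" idea is of course already possible with a plain pipe; the point of the proposed tool, as I read it, is to run many such streams in parallel with the right ordering. A trivial sketch, with placeholder host/db names:

    # Dump straight into a restore session over a pipe, nothing written to
    # disk. The proposed pg_dump_restore / "migrator" mode would effectively
    # run many of these streams concurrently, largest tables first, with
    # dependencies resolved in the right order.
    pg_dump -h old-host mydb | psql -h new-host -d mydb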