On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> On Tuesday 26 February 2008, Simon Riggs wrote:
> > So that would mean we would run an unload like this
> >
> > pg_dump --pre-schema-file=f1 --save-snapshot --snapshot-id=X
> > pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
> > pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
> > pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X
>
> As a user I'd really prefer all of this to be much more transparent, and could
> well imagine the -Fc format to be some kind of TOC + zip of table data + post
> load instructions (organized per table), or something like this.
> In fact just what you described, all embedded in a single file.

If it's in a single file then it won't perform as well as if it's in
separate files. We can put separate files on separate drives. We can
begin reloading one table while another is still unloading. The OS will
perform readahead for us on the separate files, whereas on one file it
will look like random I/O, etc.

I'm not proposing we change things to use separate files in all cases.
Just when you want to use separate files, you can.

> And I'd much prefer it if this (new?) format was trustworthy enough to be the
> new default format of -Fc dumps. Then we could add some *simple* command line
> parameter to control the threading behavior of the dump and reload process,
> à la make -j. We could even support some option for the user to tell us which
> disk arrays to use for parallel dumping.
>
> pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
> pg_restore -j4 ... mydb.dump

I like the -j syntax.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
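
As a rough illustration of the separate-files unload quoted at the top of this message, the per-table commands could be generated and run concurrently by a small driver. This is only a sketch: --data-file and --snapshot-id are the options proposed in this thread and are hypothetical here, and the helper names build_dump_commands and run_parallel are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def build_dump_commands(snapshot_id, big_tables, prefix="f2"):
    """Build one pg_dump command per big table, plus one for all other tables.

    --data-file and --snapshot-id are the options proposed in this thread;
    they are hypothetical, so treat the output as an illustration only.
    """
    commands = []
    for n, table in enumerate(big_tables, start=1):
        commands.append(["pg_dump", "-t", table,
                         f"--data-file={prefix}.{n}",
                         f"--snapshot-id={snapshot_id}"])
    # Remaining tables: exclude every table already dumped on its own.
    rest = ["pg_dump"]
    for table in big_tables:
        rest += ["-T", table]
    rest += [f"--data-file={prefix}.{len(big_tables) + 1}",
             f"--snapshot-id={snapshot_id}"]
    commands.append(rest)
    return commands


def run_parallel(commands, jobs=2, runner=subprocess.check_call):
    """Run the commands with at most `jobs` in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(runner, commands))
```

Calling build_dump_commands("X", ["bigtable", "bigtable2"]) yields the three data-dump commands from the quoted proposal, and run_parallel(commands, jobs=2) plays the role that a -j2 switch would play. The runner argument exists only so the driver can be exercised without a database.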