On Tuesday, February 26, 2008, Tom Dunstan wrote:
> On Tue, Feb 26, 2008 at 5:35 PM, Simon Riggs <[EMAIL PROTECTED]> wrote:
> > On Tuesday, February 26, 2008, Dimitri Fontaine wrote:
> >> We could even support some option for the user to tell us which
> >> disk arrays to use for parallel dumping.
> >>
> >>  pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
> >>  pg_restore -j4 ... mydb.dump
> >
> >  If it's in a single file then it won't perform as well as if it's separate
> >  files. We can put separate files on separate drives. We can begin
> >  reloading one table while another is still unloading. The OS will
> >  perform readahead for us on separate files, whereas with one file the
> >  access pattern will look like random I/O, etc.
>
> Yeah, writing multiple unknown-length streams to a single file in
> parallel is going to be all kinds of painful
[...]
> While it's a bit fiddly, putting data on separate drives would then
> involve something like symlinking the tablename inside the dump dir
> off to an appropriate mount point, but that's probably not much worse
> than running n different pg_dump commands specifying different files.
> Heck, if you've got lots of data and want very particular behavior,
> you've got to specify it somehow. :)

What I meant with the --dumpto=/mount/sda:/mount/sdb idea was that pg_dump 
would unload the data to those directories (filesystems, disk arrays, whatever), 
then prepare the final archive file from there.
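
To make that concrete, here is a rough sketch of the effect using only options 
pg_dump has today (the table and mount names are made up, and unlike the 
proposed feature the two runs do not share a single snapshot):

  pg_dump -Fc -t big_table_a -f /mount/sda/big_table_a.dump mydb &
  pg_dump -Fc -t big_table_b -f /mount/sdb/big_table_b.dump mydb &
  wait

A -j capable pg_dump with --dumpto would give you that kind of per-spindle 
layout by itself, under one snapshot, and hand you a single archive at the end.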

We could even have the --dumpto option associate each entry with a dumping 
process, or define a special TOC syntax allowing for more complex setups: 
pg_dump would first dump a TOC for you to edit, then use the edited version to 
control the parallel unloading, which disks to use for which tables, etc.
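
I am thinking of the same kind of workflow pg_restore already offers around 
its TOC, something along these lines (the pg_dump options and the disk 
annotation below are hypothetical, just to make the idea concrete):

  # what pg_restore already lets you do with a TOC:
  pg_restore -l mydb.dump > mydb.toc
  $EDITOR mydb.toc                    # reorder or comment out entries
  pg_restore -d mydb -L mydb.toc mydb.dump

  # the hypothetical pg_dump side of it:
  pg_dump --toc-only mydb > mydb.toc
  $EDITOR mydb.toc                    # e.g. annotate big tables with a target disk
  pg_dump -j2 --use-toc=mydb.toc mydb > mydb.dump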

That is exactly your idea, just with an attempt to make it look clear and 
simple from the user's point of view, at the price of some more work being 
done by the tools.
-- 
dim
