Re: [GENERAL] why postgresql over other RDBMS

Andrew Sullivan Fri, 25 May 2007 14:47:56 -0700

On Fri, May 25, 2007 at 05:28:43PM -0400, Tom Lane wrote:
> That's true at the level of DDL operations, but AFAIK we could
> parallelize table-loading and index-creation steps pretty effectively
> --- and that's where all the time goes.


I made a presentation at OSCON a few years ago about how we did it
that way when we imported .org.  We had limited time to work in, and
we had to do a lot of validation, so getting the data in quickly was
important.  So we split the data files up into segments and loaded
them in parallel.  (Chris Browne did most of the implementation of
this.)  It was pretty helpful for loading, anyway.
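
To make the split-and-load idea concrete, here is a minimal sketch in
Python.  It is not the code used for the .org import; the function
names, the segment file naming, and the worker count are all
assumptions made for illustration.  The loader is passed in as a
callable, and the stand-in below only counts rows; in practice it
would run one COPY per segment over its own database connection.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_into_segments(src, out_dir, lines_per_segment):
    """Split a line-oriented data file into numbered segment files.

    Returns the segment paths in file order. The naming scheme
    (segment_0000.dat, ...) is hypothetical.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    segments = []
    out = None
    with open(src, "r", encoding="utf-8") as f:
        for i, line in enumerate(f):
            # Start a new segment file every lines_per_segment rows.
            if i % lines_per_segment == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"segment_{len(segments):04d}.dat"
                segments.append(path)
                out = open(path, "w", encoding="utf-8")
            out.write(line)
    if out is not None:
        out.close()
    return segments


def load_in_parallel(segments, load_segment, workers=4):
    """Run load_segment(path) for every segment on a pool of workers.

    Results come back in the same order as the segments.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_segment, segments))


def count_rows(path):
    """Stand-in loader (assumption): counts rows instead of loading them."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)
```

Calling load_in_parallel(split_into_segments("dump.dat", "segments",
100000), count_rows) would exercise the whole path; the file name and
segment size there are placeholders.  The worker count is the knob
that decides whether you saturate your disks, which is the concern
raised below.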

> A more interesting question is what sort of hardware you need for that
> actually to be a win, though.  Loading a few tables in parallel sounds
> like an ideal recipe for oversaturating your disk bandwidth...

Right, you need to be prepared for that.  But of course, if you're in
the situation where you have to get a given database up and running,
who cares about the disk bandwidth? -- you don't have the database
running yet.  The kind of system that is busy enough to have that
size of database and that urgency of recovery is also the kind that
is likely to have dedicated storage hardware for that database.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Unfortunately reformatting the Internet is a little more painful 
than reformatting your hard drive when it gets out of whack.
                --Scott Morris
