Re: [Slony1-general] Large initial COPY transaction and a possible solution for the provider

Rod Taylor Tue, 06 Jun 2006 09:05:52 -0700

On Tue, 2006-06-06 at 10:25 -0500, Jim C. Nasby wrote:
> On Wed, May 31, 2006 at 10:01:26AM -0400, Rod Taylor wrote:
> > I've been thinking of the initial COPY process.
> > 
> > The problem is that with a large amount of data you end up with a very
> > large transaction on the data provider. The transaction on the
> > subscriber isn't as important since it will normally be an otherwise
> > idle database.
> > 
> > COPY in is one part, but building indexes on the subscriber is the
> > painful part and during much of this process the data provider has an
> > idle connection.
>  
> Pardon my ignorance, but is the provider actually sitting in a
> transaction while the subscriber is building indexes, and if so, why?
> ISTM there's no reason you'd need indexes (or RI for that matter) while
> loading data into a subscriber.


Yes, it is. Indexes are mostly disabled during the copy itself, then a
second pass is made after the COPY to re-enable and rebuild them. The
provider is in a transaction for the same duration as the subscriber.

That said, changing that wouldn't really help much, since the admin can
already remove the indexes on the subscriber at the beginning and add
them again at the end.
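
As a rough sketch of that manual workaround (nothing Slony does for you;
the function names and the sample row are made up for illustration), the
admin could generate the statements from (schemaname, indexname,
indexdef) rows selected from the pg_indexes view on the subscriber. This
only suits plain secondary indexes: an index backing a primary key or
unique constraint cannot be dropped with DROP INDEX, and Slony needs the
key on each replicated table anyway.

```python
# Sketch only: build the DROP/CREATE statements an admin could run by hand
# on the subscriber. Input rows mimic (schemaname, indexname, indexdef)
# as selected from the pg_indexes system view.

def quote_ident(name):
    """Double-quote an identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'

def index_swap_statements(index_rows):
    """Return (drops, creates) for the given pg_indexes-style rows."""
    drops, creates = [], []
    for schema, index, indexdef in index_rows:
        drops.append("DROP INDEX %s.%s;" % (quote_ident(schema), quote_ident(index)))
        # indexdef is already a complete CREATE INDEX statement.
        creates.append(indexdef.rstrip(";") + ";")
    return drops, creates

# Hypothetical example row; table and index names are invented.
rows = [
    ("public", "orders_customer_idx",
     "CREATE INDEX orders_customer_idx ON orders USING btree (customer_id)"),
]
drops, creates = index_swap_statements(rows)
for stmt in drops:    # run before subscribing
    print(stmt)
for stmt in creates:  # run once the initial COPY has finished
    print(stmt)
```

Save the CREATE statements before running the DROPs, since the
definitions are gone from pg_indexes afterwards.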

The big problem is the COPY. If you have 500GB or more of data being
replicated between two nodes, and you wish to add a third, it is
impossible at the moment to break the copy up into smaller steps. The
entire dataset needs to be copied at the same time.
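
To put a rough number on how long the provider's transaction stays open,
here is a back-of-envelope sketch. The throughput and index build time
are assumptions picked purely for illustration, not measurements from
any real system.

```python
# Sketch only: estimate how long the provider transaction stays open when
# the whole dataset must be copied in one go. All rates are assumed.

def open_transaction_hours(data_gb, copy_mb_per_s, index_build_hours=0.0):
    """Hours the provider sits in one transaction: copy time plus the
    subscriber's index rebuild, since both run inside the same step."""
    copy_seconds = data_gb * 1024.0 / copy_mb_per_s
    return copy_seconds / 3600.0 + index_build_hours

# 500GB at an assumed sustained 20 MB/s, with an assumed 3 hour rebuild.
copy_only = open_transaction_hours(500, 20)
with_indexes = open_transaction_hours(500, 20, index_build_hours=3.0)
print("copy only: %.1f h, with index rebuild: %.1f h" % (copy_only, with_indexes))
```

Plug in your own numbers; the point is only that the whole thing is one
transaction on the provider.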


There might be a solution for adding additional nodes on the same
version of PostgreSQL using PITR-type tricks, but between versions
you're cooked.

-- 

_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
