Re: [GENERAL] replication from multiple master servers to a single read-only slave

2010-01-12 Thread Dimitri Fontaine
Omar Mehmood omarmehm...@yahoo.com writes:
 I really don't want to use separate schemas for each master to
 logically partition the data.  I ensure that the data on each master
 will not clash with each other (in terms of any DB level contraints
 such as PK), so I'd much prefer they all reside in a single schema.
 Also, my understanding is that Slony uses DB triggers to track changes
 (but I want to avoid using DB triggers).

If you want to avoid all PostgreSQL features… well I don't see that I
can help you here.

If you were to change your mind the following document talks about how
to federate data from several databases to the same central one, and
using either inheritance or triggers to move the incoming data from the
N origin schemas to the central one.

I guess how the data gets to being available for your central queries is
not solved by refusing to use a schema per origin server.

  http://wiki.postgresql.org/wiki/Londiste_Tutorial#Federated_database

 Another additional constraint-- the master servers may not always have
 connectivity to the slave machine, so the chosen mechanism needs to be
 robust and not assume 100% uptime.

Londiste fits this need.

Regards,
-- 
dim

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] replication from multiple master servers to a single read-only slave

2010-01-12 Thread Joshua Tolley
On Mon, Jan 11, 2010 at 04:18:30PM -0800, Omar Mehmood wrote:
 (but I want to avoid using DB triggers).

snip

 I will check out Bucardo.

Bucardo uses triggers just like Slony does. That said, it seems strange that
you'd want to avoid them. Is there any particular reason you want to avoid
them?

Bucardo should handle the disconnection problems you described just fine.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature


Re: [GENERAL] replication from multiple master servers to a single read-only slave

2010-01-12 Thread Omar Mehmood
Thanks for note.  I'm trying to avoid triggers based on past trauma from going 
into a client setup that uses a zillion of them and having to wait in order of 
seconds for a simple mod DML statement to execute :)  I've been looking for an 
alternative that either sits above the database level (without building my own) 
or hoping there is a binary logging option.  Anyway, I've been reading up on 
triggers, replication and synchronization options for PostgreSQL and it seems 
like the performance impact isn't too bad.  In particular, I've been reading up 
on Bucardo and it seems to address all my needs-- I also fired off an email to 
Selena D on it and she's also affirmed that it looks to solve my problem.  I'm 
going to setup an environment for testing and then post my questions directly 
to the bucardo-users mailing list.

Omar

--- On Tue, 1/12/10, Joshua Tolley eggyk...@gmail.com wrote:

 From: Joshua Tolley eggyk...@gmail.com
 Subject: Re: [GENERAL] replication from multiple master servers to a single 
 read-only slave
 To: Omar Mehmood omarmehm...@yahoo.com
 Cc: Ben Chobot be...@silentmedia.com, pgsql-general@postgresql.org
 Date: Tuesday, January 12, 2010, 2:12 PM
 On Mon, Jan 11, 2010 at 04:18:30PM
 -0800, Omar Mehmood wrote:
  (but I want to avoid using DB triggers).
 
 snip
 
  I will check out Bucardo.
 
 Bucardo uses triggers just like Slony does. That said, it
 seems strange that
 you'd want to avoid them. Is there any particular reason
 you want to avoid
 them?
 
 Bucardo should handle the disconnection problems you
 described just fine.
 
 --
 Joshua Tolley / eggyknap
 End Point Corporation
 http://www.endpoint.com
 


  

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] replication from multiple master servers to a single read-only slave

2010-01-11 Thread Omar Mehmood
I'm wondering if it's possible to have a setup with multiple master servers 
replicating to a single slave.  I can guarantee that each server will generate 
unique PK values for all tables and all the data is partitioned (logically by 
server) across the servers.  I would simply like to have a read-only slave that 
is a picture of all the servers' data (relatively up to date).  The individual 
master servers never need to know about each other's data (i.e. they do not 
_need_ to sync with each other, nor do I want them to be sync'd).

Would it be possible to use PostgreSQL PITR feature to support this 
functionality ?  All of the data created/updated/deleted per server is unique 
to that server, so replaying the log to the slave should technically be safe 
and the replaying logs from multiple servers should be safe as well (as long as 
the relative order of replay is preserved).  I'm just wondering how to get 
around the numbering of the log (WAL) files and the slave's tracking of the log 
files that it has already processed.

I can certainly write my own application log module that runs on each server, 
ship over the log to the slave machine and replay the logs to the slave (in the 
meanwhile ensuring that the order of replay is preserved and all that good 
stuff), but I'm trying to find a quick(er) solution for the short term.

Please note that I'd like to avoid using PostgreSQL data partitioning as well 
as any DB triggers (in case anyone was going to go down that path as part of 
the solution).

Omar


  

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] replication from multiple master servers to a single read-only slave

2010-01-11 Thread David Fetter
On Mon, Jan 11, 2010 at 03:02:18PM -0800, Omar Mehmood wrote:
 I'm wondering if it's possible to have a setup with multiple
 master servers replicating to a single slave.  I can guarantee
 that each server will generate unique PK values for all tables and
 all the data is partitioned (logically by server) across the
 servers.  I would simply like to have a read-only slave that is a
 picture of all the servers' data (relatively up to date).  The
 individual master servers never need to know about each other's
 data (i.e. they do not _need_ to sync with each other, nor do I
 want them to be sync'd).
 
 Would it be possible to use PostgreSQL PITR feature to support this
 functionality ?

No, but you could use something like Slony to do this.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] replication from multiple master servers to a single read-only slave

2010-01-11 Thread Omar Mehmood
Thanks for the suggestions.

I really don't want to use separate schemas for each master to logically 
partition the data.  I ensure that the data on each master will not clash with 
each other (in terms of any DB level contraints such as PK), so I'd much prefer 
they all reside in a single schema.  Also, my understanding is that Slony uses 
DB triggers to track changes (but I want to avoid using DB triggers).

Another additional constraint-- the master servers may not always have 
connectivity to the slave machine, so the chosen mechanism needs to be robust 
and not assume 100% uptime.

For truly simple, I could do a periodic data dump of the database on the 
server, ship to slave, and restore.  However, this is way overkill since there 
won't be that many changes in the data for the period of time that it will run 
(e.g. every 15-20 minutes).  Plus, I might end up in the situation where the 
slave will start to increasingly lag behind over time (depends on the number of 
master servers and amount of data).

I will check out Bucardo.

Omar

--- On Mon, 1/11/10, Ben Chobot be...@silentmedia.com wrote:

 From: Ben Chobot be...@silentmedia.com
 Subject: Re: [GENERAL] replication from multiple master servers to a single 
 read-only slave
 To: Omar Mehmood omarmehm...@yahoo.com
 Date: Monday, January 11, 2010, 6:58 PM
 I'm pretty sure you can do this with
 Bucardo, and I *think* you can do it via Slony, if you're
 willing to use seperate schemas for each master. I know you
 cannot do this with PITR.
 
 On Jan 11, 2010, at 4:02 PM, Omar Mehmood wrote:
 
  I'm wondering if it's possible to have a setup with
 multiple master servers replicating to a single
 slave.  I can guarantee that each server will generate
 unique PK values for all tables and all the data is
 partitioned (logically by server) across the servers. 
 I would simply like to have a read-only slave that is a
 picture of all the servers' data (relatively up to
 date).  The individual master servers never need to
 know about each other's data (i.e. they do not _need_ to
 sync with each other, nor do I want them to be sync'd).
  
  Would it be possible to use PostgreSQL PITR feature to
 support this functionality ?  All of the data
 created/updated/deleted per server is unique to that server,
 so replaying the log to the slave should technically be safe
 and the replaying logs from multiple servers should be safe
 as well (as long as the relative order of replay is
 preserved).  I'm just wondering how to get around the
 numbering of the log (WAL) files and the slave's tracking of
 the log files that it has already processed.
  
  I can certainly write my own application log module
 that runs on each server, ship over the log to the slave
 machine and replay the logs to the slave (in the meanwhile
 ensuring that the order of replay is preserved and all that
 good stuff), but I'm trying to find a quick(er) solution for
 the short term.
  
  Please note that I'd like to avoid using PostgreSQL
 data partitioning as well as any DB triggers (in case anyone
 was going to go down that path as part of the solution).
  
  Omar
  
  
  
  
  -- 
  Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-general
 
 




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] replication from multiple master servers to a single read-only slave

2010-01-11 Thread Craig Ringer
Omar Mehmood wrote:
 Would it be possible to use PostgreSQL PITR feature to support this
 functionality ?  All of the data created/updated/deleted per server
 is unique to that server, so replaying the log to the slave should
 technically be safe and the replaying logs from multiple servers
 should be safe as well (as long as the relative order of replay is
 preserved).  I'm just wondering how to get around the numbering of
 the log (WAL) files and the slave's tracking of the log files that it
 has already processed.

No, it can't be done with PITR and WAL-shipping. The write-ahead logs
are at too low a level and rely on the block layout of the Pg cluster.
WAL-shipping only works where master and slave start out with the exact
same data directory contents, with all the same block layout, same oids
for tables/types/etc. Just having the same tuples in tables of the same
names is not sufficient.

Given that you can't even WAL-ship from a master to a slave created by
pg_restore from a dump of the master, you can probably see why
WAL-shipping from multiple masters absolutely cannot work.

For this, you need something higher level that replicates at the
tuple-change level. A trigger-based system like Slony or Bucardo is most
likely to fit your needs.

 Please note that I'd like to avoid using PostgreSQL data partitioning
 as well as any DB triggers (in case anyone was going to go down that
 path as part of the solution).

At present Pg doesn't offer safe and convenient C-level hooks for
replication systems to attach to in order to record tuple changes.
Recording of tuple changes for replication must be done with triggers.
As you need a tuple-change level replication system, you're pretty much
out of luck.

--
Craig Ringer

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general