Any of you have experience with Bucardo or pgpool-II as a replication
add-on?

Some background:
I switched from MySQL to PostgreSQL very cleanly using web2py as the
vehicle. Sort description to document the process: Made a copy of the
app, removed the content of the databases directory, added the
prerequisite components (database and driver) to the system, created
an empty DB, changed the connection string in the model, started MySQL
verison of app in shell mode and ran all the data out to one CSV file
and finally started the PostgreSQL version up in shell mode and did an
import of the same CSV file followed by a db.commit(). After all that
the application worked except for one group by orderby query
PostgreSQL didn't like which was easy to fix and the change worked in
MySQL as well. This was a database with 28 tables linked with lots of
relations.

My compliments to this great application server and infrastructure
surrounding it. Of the available migration tools I found out on the
net, most failed to work and would require extensive manual editing.

The application will be installed in 10 locations scattered all over
Alaska. All the locations are connected by a WAN with IPSEC to form a
VPN so it looks like it is all in the same room except for network
performance.

Each location must survive a network outage and continue to work, The
weather can be a problem up there.

Any data tables that change rapidly are to remain local to each
location.

About 2/3 of the database is configuration information which changes
very slowly. One table if this were running would have changed once in
5 years. Some tables change more often as employees come and go or
equipment is added to a location. Config changes can be delayed by
downed connections so eventual consistency is okay.

I need something automatic since the people using the system are not
technical and cannot be depended on to to a task.

By restricting the application I could get the updates to one database
instance but then there is a time delay until the local copy is in
sync.

I looked at MySQL Replication as described in the Linux Journal
article July 2010 where they do a ring which has each server to the
left in the ring is master to the slave to the right but I could see
with intermittent networks down due to bad weather this could be a
headache waiting to happen. Also the MySQL licensing has a degree of
uncertainty to it so I would rather stay away.

I am aware of PostgreSQL-R which is in beta, The uuid and timestamps
available in web2py model help but then needs to be driven by cron in
a batch oriented update.

Sometimes I look at the NoSQL databases like CouchDB or MongoDB but
then the foreign keys from the rapidly changing data is a problem. I
could bridge it using equipment hostname or something like that but
still I would rather stay inside one database.

Hard problem to solve completely I think.

Comments?

Thanks

Ron

Reply via email to