Hello All,
I've been working on designing a tool to facilitate both developers
and operations staff working with slony replicated databases. I think
that the problem described below is a general problem for people
working with systems that are both in production and under on-going
development / maintenance. As a result I would like to both solicit
the input of the community and share the results. Documentation (which
is still somewhat drafty) follows.
Thank you for your time,
Andrew Hammond
Current Approach
A common problem in the database world is handling revisions to the
database that go with revisions in the software running against this
database. Currently our method is to include upgrade.sql and
downgrade.sql scripts with each software release.
Problem Statement
This will fail when we start using slony since we need to handle DML
differently from DDL and DCL. We also need a way to apply slonik
scripts. Ordering matters in the application of these scripts.
After talking about it for a while, we agreed that developers want a
way to apply their updates without stepping on each other's toes while
in the process of developing and testing their work.
Design
Interface
updatemydatabase -f target [-y] [--force-upgrade | --force-downgrade]
[-U pguser] [-h pghost] [-p pgport] [-d pgdatabase] [--cluster
clustername]
-f
Optional Defaults to the current working directory. Specifies the
target intended to be upgraded to. This may be either the full or
relative path. This may be either a directory or a file.
-y
Optional If set, assume yes to all questions. This is intended for
use when running the program in tool mode.
-U -h -p -d
Optional As for psql and other PostgreSQL command line utilities.
--cluster
Optional Defaults to the database name. Specifies the name of the
slony cluster to work with. This should have a one-letter short form
that conforms with other similar tools. Gotta figure out what those
are though...
Since we will be using a python connector which is based on libqp,
we will auto-magically respect the standard postgres environment
variables including the .pgpass file for handling passwords.
Limitations
* We are not trying to deal with databases with more than one
slony replication cluster in them.
* We are not going to deal with the case where various sets have
different origins.
* We assume that this is run off the same machine that is
currently running the slons. We can connect to every database in the
cluster.
* Aside from generating the slonik preamble, we are not going to
try and auto-generate slonik scripts that do anything more complicated
than EXECUTE SCRIPT. At least not initially. Maybe we can get more
clever later?
* We will not try to be clever about detecting changes to files.
Alfred floated the idea of using the SVN id tag to detect if a file
had been changed since it was last applied and then forcing a
downgrade/upgrade cycle. That seems like a lot of code for a corner
case. Alfred and Long agreed that it's probably a good idea to create
a convention instead. Do not edit files after they're committed unless
it will cause in-efficiencies in the application to the production
database. Instead, create a new file. If you are forced to edit a
committed file, then email the dev list.
* Along the lines of not being clever, we assume there is only one
set, and that it's number is 1.
* We will not assume the existence of a node 1. The whole point of
increasing availability by replicating is that we don't have to rely
on the existence of a given database.
Data Structure
Each release will include a directory that has the same name as the
full release tag. This directory must contain all the scripts to be
applied. The release may include directories of scripts from prior
releases in the same parent directory. The scripts may have an
arbitrary name, but must end with a suffix of either dml.sql, ddl.sql,
dcl.sql or slonik. Script names should incorporate the bug number
they're addressing.
* /my/base/directory
o 3.10.0
+ create_foo_23451.ddl.sql
+ populate_foo_23451.dml.sql
+ alter_bar_add_column_reference_foo_23451.ddl.sql
+ update_bar_reference_foo_23451.dml.sql
+ alter_bar_column_not_null_23451.ddl.sql
+ subscribe_foo_23451.slonik
+ cleanup_some_data_migration_stuff_23451.ddl.sql
+ fix_bug_24341.ddl.sql -- these are poorly chosen
names, but hey, it's an example...
+ fix_bug_24341.dml.sql
+ fix_bug_24341.slonik
+ drop_broken_node_30031.slonik
o 3.10.1
+ another_table_29341.ddl.sql
Inside the script, we add some semantics to what are usually comments.
An example is probably the best way to show this.
-- alter_bar_column_not_null_23451.ddl.sql
-- Witty comment about why this column needs to be not null.
--dep