Re: [GENERAL] RFC tool to support development / operations work with slony replicated databases

2007-03-06 Thread Kenneth Downs

Andrew Hammond wrote:

Each release will include a directory that has the same name as the
full release tag. This directory must contain all the scripts to be
applied.


Have you considered using a data dictionary instead, so that you can 
simply diff the structures and generate DDL to bring a database to its 
current state?


If your scripts contain data modification code, they can be preserved, 
but using a dictionary smooths over a lot of the quirkiness of the 
scripts-based approach.



--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com / www.andromeda-project.org
Office: 631-689-7200   Cell: 631-379-0010

::Think you may have a problem with programming? Ask yourself this 
::question: do you worry about how to throw away a garbage can?



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] RFC tool to support development / operations work with slony replicated databases

2007-03-06 Thread Andrew Hammond
On Mar 6, 5:44 am, [EMAIL PROTECTED] (Kenneth Downs) wrote:
 Andrew Hammond wrote:
  Each release will include a directory that has the same name as the
  full release tag. This directory must contain all the scripts to be
  applied.

 Have you considered using a data dictionary instead,

Data dictionary is a pretty generic term, could you please show me
what you mean by this?

 so that you can
 simply diff the structures and generate DDL to bring a database to its
 current state?

To fully express the semantics of a change, DDL is often interspersed
with DML. A trivial example is refactoring a table to reflect a 1 to n
relationship. In order to do this successfully, you must first execute
DDL to create a new table, then execute DML to pull data from the old
table into the new table, and finally execute DDL to remove the
columns in the old table as necessary.

 If your scripts contain data modification code, they can be preserved,
 but using a dictionary smooths over a lot of the quirkiness of the
 scripts-based approach.

Again, I'd be very interested in seeing what you mean by a dictionary.

Andrew


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] RFC tool to support development / operations work with slony replicated databases

2007-03-05 Thread Andrew Hammond

Hello All,

I've been working on designing a tool to facilitate both developers
and operations staff working with slony replicated databases. I think
that the problem described below is a general problem for people
working with systems that are both in production and under on-going
development / maintenance. As a result I would like to both solicit
the input of the community and share the results. Documentation (which
is still somewhat drafty) follows.

Thank you for your time,
Andrew Hammond


Current Approach

A common problem in the database world is handling revisions to the
database that go with revisions in the software running against this
database. Currently our method is to include upgrade.sql and
downgrade.sql scripts with each software release.

Problem Statement

This will fail when we start using slony since we need to handle DML
differently from DDL and DCL. We also need a way to apply slonik
scripts. Ordering matters in the application of these scripts.

After talking about it for a while, we agreed that developers want a
way to apply their updates without stepping on each other's toes while
in the process of developing and testing their work.

Design

Interface

updatemydatabase -f target [-y] [--force-upgrade | --force-downgrade]
[-U pguser] [-h pghost] [-p pgport] [-d pgdatabase] [--cluster
clustername]

-f

   Optional Defaults to the current working directory. Specifies the
target intended to be upgraded to. This may be either the full or
relative path. This may be either a directory or a file.
-y

   Optional If set, assume yes to all questions. This is intended for
use when running the program in tool mode.
-U -h -p -d

   Optional As for psql and other PostgreSQL command line utilities.
--cluster

   Optional Defaults to the database name. Specifies the name of the
slony cluster to work with. This should have a one-letter short form
that conforms with other similar tools. Gotta figure out what those
are though...

   Since we will be using a python connector which is based on libqp,
we will auto-magically respect the standard postgres environment
variables including the .pgpass file for handling passwords.

Limitations

   * We are not trying to deal with databases with more than one
slony replication cluster in them.
   * We are not going to deal with the case where various sets have
different origins.
   * We assume that this is run off the same machine that is
currently running the slons. We can connect to every database in the
cluster.
   * Aside from generating the slonik preamble, we are not going to
try and auto-generate slonik scripts that do anything more complicated
than EXECUTE SCRIPT. At least not initially. Maybe we can get more
clever later?
   * We will not try to be clever about detecting changes to files.
Alfred floated the idea of using the SVN id tag to detect if a file
had been changed since it was last applied and then forcing a
downgrade/upgrade cycle. That seems like a lot of code for a corner
case. Alfred and Long agreed that it's probably a good idea to create
a convention instead. Do not edit files after they're committed unless
it will cause in-efficiencies in the application to the production
database. Instead, create a new file. If you are forced to edit a
committed file, then email the dev list.
   * Along the lines of not being clever, we assume there is only one
set, and that it's number is 1.
   * We will not assume the existence of a node 1. The whole point of
increasing availability by replicating is that we don't have to rely
on the existence of a given database.

Data Structure

Each release will include a directory that has the same name as the
full release tag. This directory must contain all the scripts to be
applied. The release may include directories of scripts from prior
releases in the same parent directory. The scripts may have an
arbitrary name, but must end with a suffix of either dml.sql, ddl.sql,
dcl.sql or slonik. Script names should incorporate the bug number
they're addressing.

   * /my/base/directory
 o 3.10.0
   + create_foo_23451.ddl.sql
   + populate_foo_23451.dml.sql
   + alter_bar_add_column_reference_foo_23451.ddl.sql
   + update_bar_reference_foo_23451.dml.sql
   + alter_bar_column_not_null_23451.ddl.sql
   + subscribe_foo_23451.slonik
   + cleanup_some_data_migration_stuff_23451.ddl.sql
   + fix_bug_24341.ddl.sql -- these are poorly chosen
names, but hey, it's an example...
   + fix_bug_24341.dml.sql
   + fix_bug_24341.slonik
   + drop_broken_node_30031.slonik
 o 3.10.1
   + another_table_29341.ddl.sql

Inside the script, we add some semantics to what are usually comments.
An example is probably the best way to show this.

-- alter_bar_column_not_null_23451.ddl.sql
-- Witty comment about why this column needs to be not null.
--dep