Re: [Ledger-smb-devel] Managing database change: improvements needed

Erik Huelsmann Sun, 13 Sep 2015 03:22:30 -0700

On Sat, Sep 12, 2015 at 4:27 PM, R. Ransbottom <[email protected]> wrote:

> On Fri, Sep 11, 2015 at 02:35:58PM +0200, Erik Huelsmann wrote:
>
> > > For myself, I am leaning toward setting up a local repository.
> > > I would have liked to avoid adding a larger version control
> > > requirement for my eventual replacement.
>
> > sure, can you describe how you think you will track the main repository
> and
> > how you will manage the database schema changes?
>
> As a user, I'd have the trunk, branches for local production and local
> development.  I would look to have infrequent upgrades--roughly annually.
>
> Beyond that: a branch to give back.
>
> Most local changes are simplistic, storing and displaying extra data
> fields.
> This was the territory I thought lsmb might have some protocols or
> conventions to help manage or just prevent name collisions.  Annually,
> checking them and rebuilding those would not be too onerous; that without
> a VCS.
>

Ok. That's pretty straight forward indeed.

> > For minor local changes, managing the sql model is relatively easy,
> > > it is the user view that tends to grow into the odd cracks.
>
> >
> > Could you elaborate what you mean by this statement? If the sql model of
> > your repository starts to diverge from the sql model in the main
> > repository, how is that relatively easy to manage?
>
> I meant that _small_ changes to the SQL, like above, are _relatively_
> easy to find/track without a VCS.  That is about the nature of data
> definition languages.  If the SQL code is spread around, it is still
> findable (\dS tablename, pg_dump).  From 1.4.0 to 1.4.15, there have not
> been that many changes to the database.
>

Right. That's the intent, really: the tables shouldn't change at all
(preferably) during a minor release series. The stored procedures would
change in so far as to fix bugs (preferably  without changing the
functional API's). However, sometimes that doesn't fully work...

> (So far the dojo code is pretty obscure to me, but I expect it to
> have other concerns when it is creating tables--anyway I will do Dojo 101.)
>
> I don't have any special wisdom in dealing with database change.
> But with a core business transaction application, I do have caution.
>

Ok. That's sane. With the small scripts that Sqitch provides, you'd be
handed the differences between the various schema versions by us in the
form of a list of script names (the Sqitch plan) and  a number of scripts.

> > > >>> We have a main schema file for the table definitions and a number
> of
> > > > >>> modules with stored procedures grouped by "subject".
> > >
> > > > >>> Next to that, there's a "Fixes" file, which contains all
> incremental
> > > > >>> updates to the schema since some version.
> > >
> > > > >>> When I want to change the schema as part of our development, I
> need
> > > to
> > > > >>> change the schema definition files *and* I need to add the schema
> > > changes
> > > > >>> to the Fixes.sql file.
>
> I would use make to generate Pg-database.sql from Fixes.sql.
> Pg-database.sql being refactored and Fixes.sql destroyed at appropriate
> milestones--stopping stones for user upgrades.
>

Exactly. The current plan is to generate a Pg-database.sql for *every*
release. People can then quickly set up new databases from scratch. People
with pre-existing databases can have the scripts applied that the releases
between their version and the newly installed release comes with. Sqitch
will help that process by determining which ones that are.

> Yes, the little scripts *will* be written over time, because we have small
>
> Point taken.
>
> > changes to the database schema every now and then. Currently these small
> > scripts are all "lumped" into a single large file called Fixes.sql. The
>
> I feel there is a qualitative difference to adding to a script versus
> repeatedly starting on a blank page--context, style, less naming.
>

> that they're in will fail, leading to humongous numbers of errors in the
> > database creation/upgrade logs. As a result, it's very hard for people to
> > report problems to our mailing list and even worse, users may have doubts
> > about the quality of the software, with these huge numbers of errors.
>
> I'd try to write a script with good error catching.
>

It's not impossible, but a *lot* more work than tracking which changes have
been applied using a program like Sqitch. As in: so far as I currently
know, we'd have to query the schema catalog on every change and only
execute the schema change if the schema catalog indicates it's not already
been applied. There are 2 ways I know how to do this:

1) Number/name the scripts and keep a catalog of the names/numbers, which
can be queried by the update program
2) Query PostgreSQL's own schema catalog for specific characteristics and
conditionally execute the DDL change commands

(2) would AFAIK require coding a specific stored procedure for each change,
while (1) is exactly the direction I'm proposing (by using Sqitch to do it
for us, that is).

> Wondering what you mean here. Are you planning to copy over all data from
> > one version to another every time you upgrade say from 1.4.x -> 1.4.y
>
> Yes.  What else?
>

LedgerSMB uses a different approach where we modify the existing schema by
adding columns, foreign keys, other constraints and stored procedures (or
removing them, where appropriate). You could use the same approach. If we
were to go with the use of Sqitch, you could even develop your "add my
columns" scripts, add them to Sqitch, execute the Sqitch changes and then,
upon update to 1.4.y, we'd provide you with the scripts to update your
schema from 1.4.(x') to 1.4.(y'). Where (x') is the schema including your
changes and (y') is the schema for the new version, including your changes
(as long as there are no conflicts, of course).

This way, you get the benefit of re-using the efforts from the project and
a full re-analysis of the schema wouldn't be necessary.

> (x<y)? If so, are you planning to develop scripts every time?
>
> Yes, though I expect much reuse in the typical cases.
>

Agreed. That's very much to be expected.

> > One reason why I like the way Sqitch works is that you'll be able to put
> > your schema changes in sqitch files and deploy those on your LedgerSMB
> > version. Then, when a new release comes out, you simply add those to the
> > end of the list of sqitch files and run the standard LedgerSMB upgrade
> > procedure. LedgerSMB will then simply upgrade the database with your
> > modifications in it.
>
> This sounds very much like how it is done now: q/[Ss]qitch//g.
> Is writing three scripts making things less difficult?
>

No :-) But Sqitch does a number of things we currently don't do:

1) Check deployment consistency before deployment of new changes
2) Allow verification of the deployment afterwards
3) Prevent duplicate application of changes,
     --> making the deployment logs clean of errors
          --> no longer frightening users who take a peek in this log
4) Provide a downgrade path

> http://stories.iovation.com/why-we-sqitch seems like an example where
> sqitch is worth the effort.
>

Right. I see what you mean and I basically agree. However, we look at this
on a different scale, I think: I'm looking at it from the perspective of
the software that the LedgerSMB project distributes. This means thousands
of databases across the world in way more than 2 data centers (sometimes
just at home), completely out of the control of the project, where
deployments should "simply work", upgrading from anywhere from 1.2.<latest>
or SQL Ledger 2.6/2.8 to the version being upgraded to.

I'd like the switch to Sqitch to help people with local schema
modifications to maintain their modified database and enjoy the regular
updates of the project. How exactly this should pan out is something we can
design now, but I think we'll find out more as we go too. I sure would like
to use this opportunity to formulate and formalize the advice you were
looking for but not finding in exactly this matter.

Please note that I'm very much valueing your feedback, although the above
might not come across that way (at least I'm not sure it does). I've tried
to explain my line(s) of reasoning more than trying to persuade anybody to
do anything.

-- 
Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.

------------------------------------------------------------------------------

_______________________________________________
Ledger-smb-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ledger-smb-devel

Re: [Ledger-smb-devel] Managing database change: improvements needed

Reply via email to