On Mar 18, 2010, at 7:57 PM, Adam Heath wrote:

> Marc Morin wrote:
>> I am loath to enter into this discussion, but I feel I need to share some 
>> experience I have gained dealing with databases and migration issues.
>> 
>> There are a few truisms that I have grudgingly come to accept:
>> 
>> 1) Database schemas will change over time.
>> 2) Database schema changes are a pain to deal with.  Migration issues are 
>> difficult.
>> 3) Unlike code changes, which carry no "history" with them (other than an 
>> interface specification to an external API), database changes have a long 
>> time horizon.  Decisions that were rational when the schema was designed, 
>> but that new use cases force you to refactor or extend, leave you 
>> continually dealing with the implications of that "old" design.
>> 4) It's impossible to satisfy all schema API revision constraints at the 
>> same time.
>> 5) You need to have a formal way to manage migration of data in your 
>> framework.
>> 6) Only support the newest schema; leaving legacy artifacts around in the 
>> schema tends to increase duplication, confusion, and "code entropy".
>> 7) Avoid using the schema as an API point in your system if at all 
>> possible... it can really tie your hands from a development point of view, 
>> and it means other systems will need to be updated when merging...
>> 
>> So, OFBiz doesn't really have any formal migration process.  checkdb() adds 
>> missing pieces, but it misses some kinds of change (column type changes, 
>> index changes, pkey changes, drops, null/not null changes), and more 
>> importantly, data migration is meant to happen outside the framework, 
>> manually... leading to problems, inconsistencies, and difficulties.
>> 
>> The "rename" the old entity, create a migration service, then manually run 
>> it, is a "weak" migration framework.
>> 
>> I have had success when trying to keep a tighter lid on things:
>> 
>> 1- automatically create .sql migration files representing the differences in 
>> schemas. (rev<n> - rev<n-1> differences).
>> 2- automatically add insert/update/delete commands representing the 
>> differences in seed data to these files.
>> 3- manually edit these files, or generate new ones, to handle data 
>> migrations the "simple" cases don't cover, such as moving data from one 
>> entity to another, etc...
>> 4- these SQL migration commands become a formal part of the codebase.  
>> Migration scripts run automatically on older schema versions (the trick is 
>> to detect what version the schema actually is... you can't rely on a 
>> version number, etc., too easy to be inaccurate... and then run all 
>> "needed" migration commands).
>> 5- only support the "current" schema in the code base.  This means that as 
>> the schema evolves, code that isn't in the project will need to be 
>> upgraded when merging in these changes.
>> 
>> Obviously the above series of steps is meant to move all "legacy" databases 
>> through an upgrade process to the current revision, such that they end up 
>> indistinguishable from a database created from a newly minted schema.
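>> 
>> As a rough illustration only (the entity, column, and status names below 
>> are made up, not actual OFBiz entities), one of the migration files from 
>> steps 1-4 might look something like this:
>> 
>>   -- migration rev<n-1> -> rev<n>: schema differences found automatically
>>   ALTER TABLE EXAMPLE_ITEM ADD COLUMN STATUS_ID VARCHAR(20);
>> 
>>   -- seed data differences found automatically
>>   INSERT INTO EXAMPLE_STATUS (STATUS_ID, DESCRIPTION) VALUES ('EX_ACTIVE', 'Active');
>>   INSERT INTO EXAMPLE_STATUS (STATUS_ID, DESCRIPTION) VALUES ('EX_CLOSED', 'Closed');
>> 
>>   -- hand-written data migration for a case the generator cannot infer
>>   UPDATE EXAMPLE_ITEM SET STATUS_ID =
>>     CASE WHEN OLD_STATUS_FLAG = 'Y' THEN 'EX_ACTIVE' ELSE 'EX_CLOSED' END;
>> 
>>   -- drop the legacy artifact once its data has been carried forward
>>   ALTER TABLE EXAMPLE_ITEM DROP COLUMN OLD_STATUS_FLAG;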
>> 
>> I have done this type of support in a past life, on another project other 
>> than OFBiz, and it helped tame the effects of schema changes.  We have also 
>> developed this approach at Emforium to manage upgrades of our customers' 
>> OFBiz instances.  It has been VERY DIFFICULT to keep these instances 
>> consistent with one another as the schema has changed over time... I won't 
>> say that this is 100% nailed down, but we are on our way.
>> 
>> So, I'd be interested in discussing how we can add a more formal migration 
>> declaration into OFBiz, with forward and backward migrations formally put 
>> into place (like Ruby on Rails, etc...). 
>> 
>> We can talk about using the framework we've developed for this, but there 
>> are limitations: .sql files are not likely to be portable, and they 
>> certainly can't easily be split up between datasources.
>> 
>> We've also generated a number of self tests to check that a db at rev<n-1>, 
>> upgraded to rev<n> along with its seed data, ends up identical to a freshly 
>> created rev<n> db; this is CRUCIAL to ensure that the migration chain is 
>> unbroken...
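>> 
>> As one possible sketch of such a check (assuming the upgraded and the 
>> freshly created databases are loaded side by side as two schemas, here 
>> called upgraded and fresh, on a database that supports EXCEPT; both 
>> queries should return no rows):
>> 
>>   -- catalog metadata that differs between the two schemas
>>   SELECT table_name, column_name, data_type
>>     FROM information_schema.columns WHERE table_schema = 'upgraded'
>>   EXCEPT
>>   SELECT table_name, column_name, data_type
>>     FROM information_schema.columns WHERE table_schema = 'fresh';
>> 
>>   -- seed rows present in the fresh db but missing after the upgrade
>>   SELECT * FROM fresh.example_status
>>   EXCEPT
>>   SELECT * FROM upgraded.example_status;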
> 
> 
> I've got code started to handle this, from a few months ago.
> 
> <UpgradeScript id="FooBarBaz" location="component://$name/path/file.ext"/>
> 
> Then, at startup, a separate, special delegator is created very early,
> which uses a private datastore (derby, postgres, or whatever).  Any
> newly imported UpgradeScripts are then run just this once.
> 
> My code had support for script dependencies, a la debian, and could
> order things appropriately.  The idea is based on another package I
> saw developed, which did database stuff the old-fashioned way, with a
> bunch of SQL scripts individually developed for each database vendor.
> The package stored the current version somewhere, and it just ran all
> upgrade scripts in series from the last installed version to the new
> version.  This would handle multi-year upgrades, but it required
> keeping those upgrade scripts around forever.  However, the scripts
> were isolated, so the mainline code was kept clean.
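> 
> For illustration only (this is not code from that package, and the names
> are made up), the bookkeeping can be as simple as a one-row version table
> that every upgrade script bumps:
> 
>   CREATE TABLE SCHEMA_VERSION (CURRENT_REV INTEGER NOT NULL);
>   INSERT INTO SCHEMA_VERSION (CURRENT_REV) VALUES (41);
> 
>   -- upgrade_41_to_42.sql would end by recording itself:
>   UPDATE SCHEMA_VERSION SET CURRENT_REV = 42 WHERE CURRENT_REV = 41;
> 
>   -- at startup the runner reads CURRENT_REV and applies every later
>   -- script in series until the schema reaches the newest revision.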
> 

This sounds really promising, Adam.
