On Mar 18, 2010, at 7:57 PM, Adam Heath wrote: > Marc Morin wrote: >> I loath to enter into this discussion, but feel that I need to put in a bit >> of experience that I have observed when dealing with databases and migration >> issues. >> >> There are a couple of truisms that I have come to accept, grudgingly: >> >> 1) Database schemas will change over time. >> 2) Database schema changes are a pain to deal with. Migration issue are >> difficult. >> 3) Unlike changes in code, that have no "history" to them (other than an >> interface specification to external api), database changes, have a long time >> horizon. Decisions that were rational that lead to a schema design at the >> time, but new use cases, etc.. result in re-factoring or enhancing cause you >> to continually deal with the implications of this "old" design. >> 4) It's impossible to satisfy all schema API revision constraints at the >> same time. >> 5) You need to have a formal way to manage migration of data in your >> framework. >> 6) Only support the newest schema, don't leave legacy artifacts around in >> schema, tends to increase duplication, increased confusion, and increased >> "code entropy". >> 7) Avoid schema as an API point in your system if at all possible... can >> really tie your hands from a development point of view. Means other systems >> will need to be updated when merging... >> >> So, Ofbiz doesn't really have any formal migration process. checkdb() adds >> missing stuff, but is incomplete in some changes (column type changes, index >> changes, pkey changes, drops, null/not null changes), but more importantly, >> the data migration framework is meant to be outside, and manual... leading >> to problems, inconstancies, and more importantly, difficulties. >> >> The "rename" the old entity, create a migration service, then manually run >> it, is a "weak" migration framework. >> >> I have had success when trying to keep a tighter lid on things: >> >> 1- automatically create .sql migration files representing the differences in >> schemas. (rev<n> - rev<n-1> differences). >> 2- automatically add insert/update/delete commands representing the >> differences in seed into these files. >> 3- manually edit these files, or generate new ones to represent migration of >> data, where the "simple" cases are not handled, such as moving data from one >> entity, to another, etc... >> 4- these sql migration commands become formally part of the codebase. >> Migration scripts automatically run on older schema versions (trick is the >> detect what version the schema actually is.... can't rely on version number, >> etc... too easy to be inaccurate, and run all "needed" migration commands). >> 5- Only supporting the "current" schema in the code base. This means that >> as the schema evolves, code that isn't in the project will need to be >> upgraded when merging in these changes. >> >> Obviously the above series of steps are meant to move all "legacy" databases >> through an upgrade process to the current revision, such that they should be >> indistinguishable from being created from a newly minted schema. >> >> I have done this type of support in my past life, with another project, >> other than ofbiz and it helped tame the affects of schema changes. We also >> have developed this approach at Emforium to manage the upgrade of our >> customer's ofbiz instances. It has been VERY DIFFICULT to keep these >> instances consistent with one another, as the schema has changed over >> time.... I won't say that this is 100% nailed down, but we are on our way. >> >> So, I'd be interested in discussing how we can add a more formal migration >> declaration into ofbiz, with forward and backward migrations formally put >> into place (like ruby on rails, etc...). >> >> We can talk about using our developed framework for this, but there are >> limitations; such as .sql files are not likely portable, and certainly can't >> be easily split up between datasources. >> >> We've also generated a number of self tests, to test that a db at rev<n-1> >> can be upgraded to rev<n> with the seed to be identical, this is CRUCIAL to >> ensure, that the migration chain is unbroken... > > > I've got code started to handle this, from a few months ago. > > <UpgradeScript id="FooBarBaz" location="component://$name/path/file.ext"/> > > Then, at startup, a separate, special delegator is created very early, > which uses a private datastore(derby, postgres, or whatever). Any > newly imported UpgradeScripts are then run just this once. > > My code had support for script dependencies, ala debian, and could > order things appropriately. The idea is based on another package I > saw developed, which did database stuff the old fashion way, with a > bunch of sql scripts individually developed for each database vendor. > The package stored the current version somewhere, and it just ran all > upgrade scripts in series from the last installed version to the new > version. This would handle multi-year upgrades, but it required > keeping those upgrade scripts around forever. However, the scripts > were isolated, so the mainline code was kept clean. >
This sounds really promising Adam.