Some more info: It seems Sqitch wants to address the issue of multiple branches receiving the same patches and upgrading from one branch to another, like we want to support: https://github.com/theory/sqitch/issues/200
Regards, Erik. On Wed, Aug 26, 2015 at 9:13 PM, Erik Huelsmann <[email protected]> wrote: > Hi, > > Hi all, >> >> There are a few issues that I have with the way we currently handle >> database schema changes. >> >> Current situation >> >> ============ >> >> We have a main schema file for the table definitions and a number of >> modules with stored procedures grouped by "subject". >> >> Next to that, there's a "Fixes" file, which contains all incremental >> updates to the schema since some version. >> >> When I want to change the schema as part of our development, I need to >> change the schema definition files *and* I need to add the schema changes >> to the Fixes.sql file. >> >> >> I agree this is less than ideal. I think we need to think of a >> comprehensive tooling approach and there are a few reasons we haven't done >> this (mainly because I think getting the tooling right is a lot of effort >> and there has in the past been other urgent things that needed more effort >> at the beginning). However I think this is growing in urgency as we >> continue development. >> > Right. I'm merely summarizing for those who didn't know the current > situation and to support my line of reasoning. (I don't think you think > that I am, but for the record: I'm not criticizing the current situation.) > > >> Problems with the current approach >> >> ========================== >> >> The current approach has (IMO) several problems: >> >> * When loading the "Fixes.sql" file, even in a production environment, >> produces a truckload of ERRORs, because the schema may contain some or all >> of the fixes in the file, leading to aborted transactions >> >> * Making adjustments in multiple places can lead to incompleteness on >> either side >> >> * Fixes.sql is loaded twice; before and after loading the stored >> procedures, making it unsuitable for some kinds of fixes >> >> * In order to change types or parameter lists of stored procedures, they >> need to be dropped before being (re)created, resulting in DROP TYPE and >> DROP FUNCTION all over the place. >> >> Especially the high number of ERRORs in the log file when creating a new >> database is disconcerting to new users. Myself, I'm running into the DROP >> TYPE and DROP FUNCTION more and more often as my refactorings for 1.5 and >> 1.4-mc reach deeper into the system. >> >> Additional requirements >> >> ================= >> >> * Next to solving the problems above, some users have expressed the >> desire to extend LedgerSMB locally. However, the way we currently upgrade >> the system removes all customizations (at least from 1.2->1.3->1.4; maybe >> not 1.4->1.5). If our changed approach could help solve this requirement, >> all the better. >> >> * The solution chosen preferably works with a strategy of branching and >> merging as we do in the development cycle. >> >> Agreed on all the above. >> > > Good. Then we have good reasons to change :-) > > >> Possible solutions >> >> ============= >> >> One thing that I think we should do is separate development from >> production when loading the database. That is to say: when creating a new >> database, that is a different step than upgrading a production database >> which is in turn again very different from working on a development >> database. Our processes should work to provide the best for each. >> >> >> I am not sure I understand what you mean by this. Do you mean that we >> need an extra build step? I would not be opposed to that. But if that >> isn't what you mean, could you be more specific? >> > > Well, other than the standard Perl steps "copy to blib" and "manify POD", > we don't really have build steps of our own. I'm indeed proposing that we > have a build step. I envision it to be something like this: > > * Create PostgreSQL database > * Play Sqitch scripts > * Dump schema from the database > > and then consider the dumped schema a build artifact which doesn't need > version control. This step would be executed before creating a distribution > archive. > > The pre-distribution step should -if we go with Sqitch- also include > compilation of a Sqitch bundle, so we don't distribute development > progress, branches, or whatever that might be in the Sqitch repository > (which is - as far as I understand it - stored in the development tree). > > > Proposed solution to the duplicate change problem >> >> ===================================== >> >> In the past we have been talking about adding a "build" step to >> LedgerSMB. This step could be used to produce a single schema file for >> quickly setting up a new database (company). This schema file would be a >> build artifact and no longer be version controlled itself. It'd be the >> result of a full schema creation step, including running the Fixes.sql file. >> >> Additionally, this step could be used to deliver an up-to-date >> doc/database/ directory with current database documentation. >> >> I like that idea. >> >> While this step may feel inhibiting for development, one thing I'm >> thinking is that we may not need to require this step to be executed on a >> development system, except for when testing the production deployment. >> >> And testing the build system. >> > > Exactly. I think that the Travis-CI tests (c/sh)ould include the > generation of the schema and testing off of that. > > >> Proposed solution to the slew of errors from Fixes.sql >> >> ======================================= >> >> There are actually a number of solutions here, as I see it, all of them >> revolving around the idea that every schema change should be applied once. >> Basically, I see 2 categories of solutions: >> >> 1. Do it ourselves >> >> 2. Re-use the work of others >> >> The benefit of (1) is that we get full control and no further >> dependencies for development or production. However, the downside is that >> we get to do all the thinking and problem solving as well. >> >> In both categories I see similar solutions available: >> >> a. Numbered changes >> >> b. Named changes >> >> and in (1) I see a solution that's not available in category (2): >> >> c. Use PostgreSQL EXTENSIONs >> >> As for category (2), I haven't looked too far around yet, but I did find >> sqitch (http://sqitch.org/); Sqitch offers a command line tool for the >> development workflow. Additionally, it provides facilities for deploying >> only the necessary changes with releases *and* it provides an API which we >> can use to upgrade the database from one (patch) release to another. >> >> PG Extensions are definitely in the "reuse work of others" category but >> the system solves a different problem than sqitch. >> > > This is an interesting difference in views and we should explore it a bit. > > >> Indeed the two together could be a very powerful combination. >> >> PG Extensions would allow us to ship db components (whether table sets, >> etc) in a versioned way. There is likely to be a learning curve here and >> it adds a major build dependency for people installing from our source >> packages. But it provides some really nice features including versioning >> and dependency tracking/enforcement. >> >> What PG Extensions would allow us to do would be to effectively ship >> versioned pieces of our db schema and have an orderly way of upgrading them. >> > > Ok. So far, I'm not sure I see the difference with Sqitch, apart from the > fact that Sqitch is something we'd have to depend on and extensions are > built into PostgreSQL. One thing where I can see a difference - but we > wouldn't be using it, is my current estimation - is managing *external* > dependencies. But with Sqitch, we can manage internal dependencies, so, if > that's what we're using extensions for, what would be your envisioned > difference? > > >> For things like stored procedure sections, we could always drop the >> extension and load a new version. But it would make it very clean. And we >> could even break up parts of the db into modules that could be >> independently versioned. >> > >> Getting this tooling to work right is not a minor project however. It >> definitely cannot be done in a minor release because of the additional >> dependencies, and I think we'd need a policy against adding extensions to >> core in major versions. However it would also provide a nice platform for >> third party integration and reuse as well. >> >> Each of our modules could be an extension and separately versioned. >> > > Ok. I like the technical side of that, but I'm not sure we need that at > this point. I *do* see value for this in terms of optional modules. Say the > way we used to have Budgetting as an extension in 1.3 (although it solves > only 20% of the problem, because hooking up Budgetting into the > web-application was actually a bigger problem than getting the tables > installed). > > For branching and merging, the extensions would have a number specific to >> development head and could be re-numbered during the build process. For >> modules all we really need to do is drop extension/create extension if the >> version number has changed. >> >> We'd probably want to reserve a number series for "next in version >> series." I am thinking that using 999 would be a good one, so 1.4.999 >> would be "whatever next comes in 1.4" and this would allow us to install >> and run out of the development tree. >> > > Hmmm. Sqitch - like version control systems - does this by itself (in casu > by generating a hash, like Git and other DVCSes). > >> As for sqitch, I have no experience with it yet. It's supposed to work >> well with branching and merging. One thing it *does* do is integrate with >> version control tools and it means to integrate with the project's >> repository. Ideally when merging and branching, no additional processing is >> required to integrate the changes from branches. However, I'm not exactly >> sure that's what happens (given >> https://groups.google.com/forum/#!searchin/sqitch-users/merge/sqitch-users/GXqgt7nJ_1k/Vvg-r1HOEqMJ) >> but I think the referenced link is about reordering changes which already >> have been partially deployed. What I like about Sqitch is that it >> integrates with the VC system, but is VC system agnostic, some of us can >> use it with Git while others can keep using Hg-git as they currently do. >> >> Ok so if we ship our storage schema as an extension (or group of >> extensions) we still need to have ordered versioning going on. >> > > Well, it seems to me that that's what Sqitch does, *unless* you have a > database which is running some development version. From tag to tag (which > would be from release to release), ordering doesn't seem to matter. The > problem the page seems to refer to - but I should probably subscribe to the > mailing list and ask this quention - a database which is left at an > intermediary state. > > I am thinking that sqitch may allow us to do the grunt work of the schema >> change scripts as a part of the build phase (we have to strip out >> begin/commit statements since create/alter extension adds those), but it >> seems like it would be very helpful in managing that side. >> > > Yea. Maybe this too is a question to ask Sqitch's mailing list. > >> What I *am* sure about is that (1) will be completely change-oriented. >> *Maybe* we can use sqitch with a the current module-based source code >> organization. >> >> If we go with extensions we don't need to. We just bump the version >> number on each change of a module and keep a catalog of what is required >> (and tests to verify of course). We can then drop/create extensions as >> needed for the procedural modules, and we can generate migration scripts >> for the for the storage extension(s). The big difference is that instead >> of one db version we would have a bunch of components that were versioned >> and likely a requirement that these match on login. >> > > Hmm. I'm somehow that doesn't sound like a really attractive outlook: more > things that (apparently) can go wrong. > >> If we're going to go with Sqitch, I don't think it's a good idea to >> switch just then and there, but test it to build up some experience and >> then choose a well defined point in time to start using it. >> >> Agreed. Maybe start on plugins? >> > > Actually, I was thinking of starting with 1.5 *after* it's been upgraded > from 1.4? However, that would mean that *everybody* would need to migrate > through 1.5.0, because that's the moment we'll move to Sqitch? (Or we could > choose 1.6 or ...?) > > >> So, there are 2 proposals here. What do you say? >> >> I don't see how we can get away from doing some things ourselves, so I >> guess the question is what and where to reuse other work. >> >> So my preference would be sqitch and pg extensions. In my mind this >> would require a two-step build process for dbs: >> >> 1. Build the schema extensions using sqitch >> 2. Install all pg extensions using make. >> > > Ok. Is this on releases only, or with our day-to-day work on the tree? > > My view on using Sqitch for everything in our schema would be to replace > the current schema loading with a call to Sqitch's API. Of course we have > to do that ourselves, but it's not in the category that I'd call "do > something ourselves", because all we have to do is create the code to > delegate to "the other project's code". But figuring out *how* to do that, > yes, that's our work. > > The first is a developer-only task (sort of like running YACC is when >> building PostgreSQL -- not even required when installing from source >> packages) so there isn't a user-visible dependency there. >> >> The second though does impose some user dependencies in that pg_config >> and other pg development tools must be installed before the build can be >> done. Of course for those installing debs, these would be done when >> building debs rather than when installing them. >> > > If we use sqitch only, we also need it as a dependency, both for > developers and for end users. However, we probably don't need to learn our > users about it, if we simply use it as a library. > > So that is my preference and why. >> > Thanks for your response! > > With my reaction above, I'm wondering how to get to a next step. We > probably need to be practical in some sense and just start *somwhere*. But > the question would be *where* and *with what* -- in the sense that the > *what* question seems to be a choice between (as we now know them) (a) > Sqitch or (b) Sqitch+Extensions. > > > Looking forward to further discussion, > > -- > Bye, > > Erik. > > http://efficito.com -- Hosted accounting and ERP. > Robust and Flexible. No vendor lock-in. > -- Bye, Erik. http://efficito.com -- Hosted accounting and ERP. Robust and Flexible. No vendor lock-in.
------------------------------------------------------------------------------
_______________________________________________ Ledger-smb-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/ledger-smb-devel
