Good morning. Christian, I am really glad you followed up and did not leave the topic dead, as I was beginning to suspect you had.
Your reply is, as was your last, exactly the type of discussion I hoped to raise when I first posted. Thank you for staying on topic as relates to the proposal itself, I sincerely regret any and all insults communicated due to getting carried away on several oocasions. (I did post a message apologizing unconditionally to you Steffen yesterday, it does not appear to have made it through.) On Thu, Jul 08, 2010 at 03:44:02PM +0200, Christian Boos wrote: > "Prior to version 3.6.19, SQLite did not support foreign key constraints" > (http://www.sqlite.org/cvstrac/wiki?p=ForeignKeyTriggers) Granted, this is true, but joins, views and triggers are still supported, and triggers can emulate the keys and cascades for the critical relationships in these older versions. See: http://www.sqlite.org/cvstrac/wiki?p=ForeignKeyTriggers > As for views, I don't see an use case yet, but if there's one, there > wouldn't be problems to use them, except for updates, which is also > not something consistently supported. The use case for views in data modelling without update support is absolutely present! In this context, you could have Work Efforts at the bottom ("ticket" being a row in effort type). The view would then have the name Tickets, and join these systems together so they can be reached in a uniform way from the users perspective (for reporting). CREATE VIEW Tickets AS SELECT * FROM WorkEfforts WHERE effort_type = [ticket type id] JOIN [more relevant information]; And this view would then present a collection of parts from the underlying model, and provide a list of known Ticket "resources" in the work hierarchy. >From the middleware perspective, you use the views to gather data if it makes sense (many engines cache it), or go directly to the lower levels if that is the more efficient approach to whatever operation you are programming. Consider the word "Ticket". It really only makes sense to the submitter, "here is your ticket number, get in line". From all other perspectives after its submission, it represents a unit of work. Either it is a feature request that must be considered (then discarded, or implemented which is all work), or patch (review/commit/reject it), support request (help submitter) etc. This entity also, unambigiously, is handled by a *WORK*flow system, does that not sound like a hint?;-) Ticket is a *way to look* at a work effort (is a VIEW of work effort..) The word "Ticket" is used only because it corresponds to a logical entity in the *front-end* system. This point is important to note: The Ticket concept belongs in the front-end system, - as layering of software dictates - should be translated downwards. You *cannot* model this *front-end* concept *directly* in the *storage layer* as you do, without serious side effects. Anyway with this approach, you are free to refactor the underlying data model at any point in time when you hit a performance problem. You will have to change the middleware to go with it, but since all reporting that the *enduser* does, is against the VIEWs, we can implement entirely new parallel reporting paradigms while leaving the old in, supporting both, without impacting existing code (yet to be ported), runtime performance or locally defined reports. All you have to do, is make sure the views present the same information after you change the underlying model. > don't mind a little reinvention and abstracting away from the > database, if this can help modularity, code reuse and overall > simplification of the system. I think that trying to map the model It is exactly these goals++ that will be acheived by normalizing the model, as I have argued previously, and strengthen further below. > (addicted?/clean? didn't get it, really). The question is, can you If you do it the normalized way, nobody can **EVER** blame your code again, because it is the wizards in the database engine that are not doing their job well enough, and it can be fixed with a better performing engine. If you do it non-relational, the performance problem exists in your code and must be handled for all eternity. Thus you have given the wizards inside the database heroin, and they will not recover and help you solve any problems. You are left to solve them all in your own middleware, while the workforce within the database is sleeping drugged-down on the sidewalk. Changing engine will only help you for an irrelevant period of time. Did that clear it up? ;-) > come up with a clean "relational" design which improves upon that, > which is normalized, easy to use, fast, scalable, yet extensible in > terms of adding/removing custom properties at run-time? Then I'm all > ears because it's also exactly what I'm looking for. Not a "perfect" solution, of course, because of the special requirements on sqlite versions and the complexity of the system. But *signifficant* improvements can be made, as I hope this post will show more clearly.. The exact workings of the custom field subsystems, I will supply input on down the line, when a few other key parts of the design has been done. Its exact final scope of work is quite unclear to me at the moment. > if we would fold everything into custom properties, but that would > most certainly be detrimental to performance, due to the very simple > key/value model we use there. So we need something a bit smarter, > but still as flexible. What you have failed to do, is to discover the fundamental properties of the entities you are modelling. Both the ticket and the milestone represents work units in a hierarchy, as is favoured by some project management disciplines, even if you *still* think they don't ;-) > The model you proposed in the RelationalModel page is only *very > remotely* related to what Trac needs, so maybe you should try to Keep in mind this statement you made, I think you may be surprised ;-) This model will help especially in a multi-project scenario. It provides *FULL* abstraction for *ALL* kinds of hierarchies (via Association pattern). None of you realize the *first* implications of this model at all, it seems to me?? You can subdivide this hierarchy forever downwards. Permissions can be programmed to apply upwards or downwards in different types of hierarchies present. Notice, that you can actually describe a whole Trac environment as unit of work with a start-end date (and its milestone/tickets in a hierarchy below it). And as such, this model provides a full, flexible, proven mechanism that in the future could support an import of, say, 10 normal environments into a single relational structure, and still maintain all information about where it came from in the first place. New relationships could now be added, stating that a ticket at one place depends on the completion of a milestone in what used to be a different environment, or must be concurrent with two milestones from two different environments. And when you take this even further, consider how two Tracs could exchange a part of this hierarchy, say Trac "1.0" contacts an old .12 env elsewhere on the globe via XMLRPC. It could then create a local work unit to represent an external ticket/milestone that will be synced periodically. The model can handle this simply with a new row in the Effort Type to classify the entity as "External work". You could now use this entity as a dependency *anywhere* in your own work breakdown, control it with a local testing or other workflow etc. Obviously, a plugin can register a row in the Effort Types, and for example provide support for an external ticket in another issue tracker alltogether: INSERT INTO WorkEffortTypes (name) VALUES ('External Work'); -- id 1 INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Trac .11 XMLRPC'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Trac .11 .db'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Trac .12 XMLRPC'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Trac .12 .db'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Bugzilla'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Redmine'); Each of these rows would have a corresponding piece of code, core or plugin, that takes care of the related business logic. This logic is free to sync in an n-level hierarchy of work below itself from the external source, and this is completely transparent to the rest of the system. It would appear as local work; read-only or read-write, pull/push sync, marked explicity or not, or whatever the code is made and configured to do. > bridge the gap. For an example use case, explain how you would > handle the addition of a custom field "version-fixed" for storing a > reference to the version in which a ticket has been fixed, assuming > the user don't want to abuse the milestone field for such a thing, > like everyone else does ;-) No problem! A version is *easily* envisioned as a work effort: You start at one point in time and finish it later, and it sure is a lot of work in between (and the exact same thing is true for any other part of repo, *and* of the repo itself). So in practice you would need some new rows in Assoc Type to declare the existence of such relationships in the system: INSERT INTO WorkEffortAssocTypes (name) VALUES ('Repository relations'); -- 1 INSERT INTO WorkEffortAssocTypes (parent_id, name) VALUES (1, 'Fixed in'); INSERT INTO WorkEffortAssocTypes (parent_id, name) VALUES (1, 'Planned for'); These rows must be directly handled by a corresponding piece of middleware code, core or plugin. Often the child rows fixed/planned are represented in code as a subclass of a repository relation class, which is in turn is subclassed from a workeffortassoctype class (not always). Only this code knows what the relationships means in terms of fields and forms to display, what to do etc. It must "manually" check that these relations are only created so they point from/to a work effort with a valid type (this is the cost of a generic model). In the specific use case you request explained, the corresponding rows to check against might be: INSERT INTO WorkEffortTypes (name) VALUES ('Repositories'); -- 1 INSERT INTO WorkEffortTypes (parent_id, name) VALUES (1, 'Subversion'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (2, 'Trunk'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (2, 'Branch'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (4, 'Release'); INSERT INTO WorkEffortTypes (parent_id, name) VALUES (5, 'Version'); The relation would then be created by inserting a row into the WorkEffortAssocs table, pointing from one work unit to another, and specifying the ID acquired from the previously shown insert as assoc_type (the "fixed in" id). This code would leave a NULL in the assoc_rule column, since it is not relevant for this type of relationship at all. from_eff_id contains the ID of a work effort declared with an effort_type as ticket (not shown). to_eff_id contains the ID of a work effort declared as an effort_type from the list immediately above. Q.E.D. ;-) > Trac, at least not for tickets. At most, there's a date associated > to the closing time of a ticket, but a ticket can be closed/reopened > many times... And by the way, all changes to the tickets are > versioned, something which is also not at all present in your > simplistic model. This information belongs in the relationship between a work unit and a state entity within the definition of a workflow, or some such, slightly depending on circumstances. > The need for versioning of changes should ideally be taken into > account automatically by the storage system. Maybe that's another > hint that perhaps a relational database is not the best suited for > the job. You should really have a look at the > TighterSubversionIntegration and the WhySQLite pages for prior > discussion on this topic. I have made a note to re-read these pages in the near future. For SQLite, I submit the following proof-of-concept: CREATE TRIGGER delete_to_revision AFTER DELETE ON storage BEGIN INSERT INTO revisions (storage_id, column1, column2, ..) VALUES (OLD.rowid, OLD.column1, OLD.column2, 'delete', JULIANDAY('NOW')); END; That will push a row to revision history when deleted. It will work in all engines for full revisioning. Its performance implications have yet to be determined across platforms, but sqlite does support ATTACH, so it could be established as a revision.db alongside the main .db to reduce runtime impact due to exclusive lock during revisioning. The approach is acceptable, imho (if it passes criteria). Another optimization is of course to do the same with custom fields, custom.db and have (support) 3 all together. > There's much more to it than just the database, the database is just > a persistence layer, but not at all the level where we place the > logic in it. Take for example the permission model: before 0.10, > permissions were all placed in the database, now starting with 0.11, > you can have any kind of permission policy implemented by plugins, > and the old db-based permission policy is just one among them, which > could even be turned off. And this type of flexibility is what makes Trac great, and the reason why I did decide to deploy it in production ;-) All I want is more of it! > things before committing to a model. Once again, I don't know yet > the final form this will take, in the end this will be a simple, > normalized, scalable model, I hope a solid foundation to build upon, > so maybe even something that could satisfy your database design > taste, who knows? I hope, maybe, I managed to communicate at least some points about the proposed model. I hope it helps. Have a nice day. Terje -- You received this message because you are subscribed to the Google Groups "Trac Development" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/trac-dev?hl=en.
