I'm working on a web app for a quality control checklist. I already have a table set up, but I have a hunch that our model is sub-optimal and I could get some better performance.I'm hoping someone on this list can help me think clearly about how to express this efficiently in SQL.
Each checklist has dozens, sometimes hundreds of questions. Each question has between 2 and 10 possible answers. Each Question is a varchar string, and so is each answer. A completed checklist is when all of the questions are associated with one of its possible answers -- i.e., when one answer is chosen. Checklists are different for different purposes, and they can change over time. So to keep completed checklists from inadvertently changing when we want to have changes in new checklists, we have templates. Templates, Questions, and Answers are a mirror of Checklists, Questions, and Answers, and represent the 'current version' of the checklist. We don't do the checklists ourselves, but provide them for our clients, so that is another level of grouping going on. So the current table hierarchy looks like this - Clients - Templates - TemplateQuestions - TemplateQuestionAnswers - Checklists - ChecklistQuestions - ChecklistQuestionAnswers Because we don't want changes in the current template to 'go back in time' and change completed checklists, data is copied from Templates into Checklists when a user goes to start a new checklist. As you can guess, this creates a lot of duplication. In ChecklistQuestionAnswers, out of about a million answer rows, there are only 4,000 distinct answers. Of course, TemplatesQuestionAnswers has duplication too, but not as bad. So what I'm think I want to do is create a versioning system for checklist templates, so I can save on space by storing unique questions with unique sets of answers only once. That way, instead of duplicating text wholesale, I can just link a Checklist against a *version* of a Template, and then a checklist set is which answer was chosen for which question. Here's what I've sketched out so far. "A clients has many templates. A template has many revisions, but only one current revision. Each revision has many questions, and each question has many ( between 2 and 10 ) answers. Each Checklist relates to one Template. Each checklist has a set answers that indicate the answer select for each question in its version of the template." Questions /* all unique question wordings */ Questions.id Questions.question Answers /* all unique answer wordings. */ Answers.id Answers.answer Templates Templates.client_id /* relates to client table. */ Templates.template_name Templates.current_version /* this is related to TemplateVersions.version_number */ TemplateVersions /* A logical grouping of a set of questions and answers */ TemplateVersions.version TemplateVersions.template_id /* relates this version to a template. */ TemplateQuestions TemplateQuestions.template_version /* relates a question to a template version */ TemplateQuestions.question_id /* relates a unique question to this template version */ TemplateQuestions.id TemplateQuestionAnswers TemplateQuestionAnswers.template_question_id /* relates this answer to a particular template version question */ TemplateQuestionAnswers.answer_id /* relates the unique question to a unique answer */ TemplateQuestionAnswers.id Checklists Checklists.id Checklists.template_version /* relates this question to a template version -- associating this checklist to a client happens through this relationship */ ChecklistAnswers /* ( I might call this something other than 'Answers' since the lack of ChecklistQuestionAnswers breaks 'name symmetry' with TemplateQuestionAnswers ) */ ChecklistAnswers.checklist_id ChecklistAnswers.question_id ChecklistAnswers.answer_id The rub I'm getting hung up on is guaranteeing that ChecklistAnswers associates a proper question-and-answer pair -- the relationship that exists in the version of the Template that its Checklist parent is referencing. In other words, each row in ChecklistAnswers must 'mirror' a question_id from TemplateQuestions to one child question from TemplateQuestionAnswers, form the template_version in Checklists. I'm trying to think of how to do this and my thinking process short circuits here. This is really the 'deliverable' of the database -- a completed checklist -- so all the other templates and everything is sort of epiphenomenal or an abstraction of that. If I can't get this working, I've missed the whole point! This seems a *little* unwieldy, so I'm wondering if I'm making a solution whose complexity is not worth the space-savings I might get from implementing it. Also note, I've simplified this a bit. There are other dimensions of complexity, such as a category system for grouping questions for reporting, but I don't think we need to get into that here. -- "Computers are useless. They can only give you answers" -- Pablo Picasso -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql