At 5:46am -0400 on Wed, 17 Sep 2008, Jeffrey Santos wrote: > I'm developing an application > that will require information from various sources. Since what > might be considered required information about those sources will > vary (occasionally dramatically), I'm unsure as to the correct > design for the database tables. In other words I'm left with (I > believe) two choices: > > 1. Create a source table that will contain, among other things, > every possible field for a source type and sort out what is > considered required information at the application level.
> If possible, maybe a general "conventional wisdom" statement > would greatly help my education on these matters! This is a common pitfall of noobs to data modeling. The idea is to try to think of everything at the forefront, which is almost always an impossible task, or to think of a minimal case to jump start the coding process, and then add "model components" later as necessary. The second pitfall is keeping model logic in the application. This violates two similar principles: Single-Point-of-Authority and Don't-Repeat-Yourself. You are using a database so that it can maintain the structure and integrity of your data. There is absolutely no other compelling reason to use a DB. (If you didn't need integrity and structure, you'd use a simpler and mucho faster flat file.) Let the DB do its job and be the single-point-of-authority. The application should certainly do things "the right way", putting data where it needs to go, but it should not be the application's /responsibility/ to keep data integrity and structure. If you rely on your application to maintain your structure, you presume that your programmers are perfect and will think of every little detail. (They're not and they won't. I promise.) As a general technology, databases receive a lot of work so that applications developers don't have to sweat the little details. Like making sure that every user in a table has a last name. Like making sure the user typed a 0 instead of an O. Like defining constraints so that developers don't have to make sure an account has enough money to make a withdraw. All they need know is that it didn't go through. The other principle of DRY also holds true. At the point you have your application try to maintain data constraints, you will inevitably have lots of repeated or similar code to maintain similar data cases. That's a different kind of nightmare. Similarly, with the DB, it's silly to define multiple columns for similar data. That's spreadsheet think. Like col_toyota, col_ford, col_chevy, col_lexus, col_buick. No. Make a single column as a foreign key to another table. Some keywords to use with Google: normalize normalization foreign keys foreign key constraints innodb Wikipedia is often a good starting point. Once you've learned the concept, I can't stress enough that you should normalize, normalize, normalize. > 2. Create a few source tables detailing the required information about > each source type. The only way I can think of doing this is something along > the lines of: > and then pull only the appropriate data from the database at query time > using JOINs. This is closer to a better approach. That said, be more specific about the problem you're trying to solve and folks may be better able to provide useful advices. Kevin -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]