At 5:46am -0400 on Wed, 17 Sep 2008, Jeffrey Santos wrote:
> I'm developing an application
> that will require information from various sources.  Since what 
> might be considered required information about those sources will 
> vary (occasionally dramatically), I'm unsure as to the correct 
> design for the database tables.  In other words I'm left with (I 
> believe) two choices:
> 
> 1.   Create a source table that will contain, among other things, 
> every possible field for a source type and sort out what is 
> considered required information at the application level.

> If possible, maybe a general "conventional wisdom" statement 
> would greatly help my education on these matters!

This is a common pitfall of noobs to data modeling.  The idea is to try
to think of everything at the forefront, which is almost always an
impossible task, or to think of a minimal case to jump start the coding
process, and then add "model components" later as necessary.

The second pitfall is keeping model logic in the application.  This
violates two similar principles: Single-Point-of-Authority and
Don't-Repeat-Yourself.

You are using a database so that it can maintain the structure and
integrity of your data.  There is absolutely no other compelling reason
to use a DB.  (If you didn't need integrity and structure, you'd use a
simpler and mucho faster flat file.)  Let the DB do its job and be the
single-point-of-authority.  The application should certainly do things
"the right way", putting data where it needs to go, but it should not be
the application's /responsibility/ to keep data integrity and structure.
 If you rely on your application to maintain your structure, you presume
that your programmers are perfect and will think of every little detail.
 (They're not and they won't.  I promise.)

As a general technology, databases receive a lot of work so that
applications developers don't have to sweat the little details.  Like
making sure that every user in a table has a last name.  Like making
sure the user typed a 0 instead of an O.  Like defining constraints so
that developers don't have to make sure an account has enough money to
make a withdraw.  All they need know is that it didn't go through.

The other principle of DRY also holds true.  At the point you have your
application try to maintain data constraints, you will inevitably have
lots of repeated or similar code to maintain similar data cases.  That's
a different kind of nightmare.  Similarly, with the DB, it's silly to
define multiple columns for similar data.  That's spreadsheet think.
Like col_toyota, col_ford, col_chevy, col_lexus, col_buick.  No.  Make a
single column as a foreign key to another table.

Some keywords to use with Google:

normalize
normalization
foreign keys
foreign key constraints
innodb

Wikipedia is often a good starting point.

Once you've learned the concept, I can't stress enough that you should
normalize, normalize, normalize.

> 2.   Create a few source tables detailing the required information about
> each source type.  The only way I can think of doing this is something along
> the lines of:

> and then pull only the appropriate data from the database at query time
> using JOINs.

This is closer to a better approach.  That said, be more specific about
the problem you're trying to solve and folks may be better able to
provide useful advices.

Kevin

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to