On Thu, Dec 27, 2001 at 04:01:57PM -0600, Gawain wrote:
> I've got a 3500 record database cataloging an art collection. The 
> data was originally assembled (by others) in Access. I've imported it 
> into MySQL to make it available on the artist's web site: 
> <http://maryloureifsnyder.com>.
> 
> The data was formatted in Access as one big table. About 800 of the 
> pieces in the collection are works in a series. In the artist's 
> catalog there's about 20 books, each containing 40 or so pages. The 
> original creators of the database developed a decimal numbering 
> scheme and simply duplicated most of the data in the records 
> describing these works. For instance, records 129.00 to 129.43 
> contain mostly the same data about the title, condition, storage, 
> etc. with only minimal (if any) changes.
> 
> My question is, is it worth the effort to normalize this data? If so, 
> what would the best method be?

Whether it's worth the effort depends on how much effort is involved 
and whether you can afford the time. (I'm assuming that you are 
volunteering your time. If not, then it depends on whether your 
client can afford your time.) Normalized tables are the best 
solution, but not necessarily the optimal solution. How much time is 
involved depends on the condition of the data now and how good you 
are at writing the SQL statements and procedural routines necessary to 
clean up the data and normalize it. Almost all clean up can be done in 
SQL, if you have the skill. If you don't, creating procedural routines 
takes more time.

I can't tell from your description how normalized (or denormalized) 
your data is. Without looking at the data, I can't tell you how 
clean it is. And I don't know how much time can be devoted to it. 
Since I don't know any of the values for the optimization problem 
parameters, I can't tell you whether it is worth it, or how far to 
carry it. I can tell you that if the data was entered without any 
data validation, and if the column data types were not carefully 
chosen, then the data will need a lot of cleaning before you start 
normalizing. 

The best method for normalization is the only method. Apply the 
normal forms in order. First normal form is better than no 
normalization. Second normal form is better than first normal form. 
And so on. I've never seen anyone take it past Boyce-Codd normal form. 
Take it that far if you can.

Bob Hall

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to