Re: Normalization question

2005-12-10 Thread Rod Heyd
Thanks for you responses!  This list has proven to be for useful to pick
pick the brains of other DBA's.

I think we are probably going to go the route of splitting the different
pieces of data into seperate columns, while keeping the original product_id,
since we have a lot of third party applications that reference that column
for data identification purposes, although not to specifically find the
information that is overloaded within the field.

Thanks!


Normalization question

2005-12-09 Thread Rod Heyd
Hi Everyone,

I've got a little debate that maybe you can all help me settle.

I have three tables that I need to join, but there are performance problems
with the joins due to some misunderstandings of my predecessor about what's
needed to join the tables efficiently.

Here's the setup.

t1 has a unique key defined on it, call it command_id, t1 has a 1 to many
relationship with t2.
t2 has t1's command_id as a foreign key constraint, plus a set of one or
more instrument_id's.
Each command results in one or more instruments taking data.

The commanding elements defined here are then set to our instrument to
aquire the data.
When the data comes back it is organized into t3 by command_id and
instrument_id.
So the primary key on t3 looks like this: command_id_instrument_id.

So, now I need to write a query that looks at what was commanded in t1 and
t2 and then look for
any missing data that has not yet been received.

So, I've got a query that looks something like this:

SELECT  stuff

FROM
t1
JOIN
t2
ON
   t1.command_id = t2.command_id

LEFT JOIN
  t3
ON
t3.data_id = concat(t1.command_id,'_',t2.instrument_id)

Now, I think everyone is going to see immediately that the join on t3 is
going to have absolutely horrible performance,
the join condition here can't take advantage of any indexes since the string
function pretty much destroys any hope of that. To make matters worse, the
left join is a total killer.


So my suggestion to solve the performance bottleneck is to add two columns
to t3,
command_id and instrument_id, and create a combined index on the command_id
and instrument_id
columns.

the join condition on t3 then becomes:

LEFT JOIN
 t3
ON
t3.command_id = t1.command_id
AND
t3.instrument_id = t2.instrument_id

This works beautifully!  The performance of this new join condition is about
480 times faster than the original.

Here's the rub.  Since there is a unique data_id that already exists which
combines the information in both
command_id and instrument_id keys, I'm being told by our seasoned software
developer that I am violating classic database design rules against
redundant data.

In my opinion, this is a minor violation at best.  We have a good reason
for wanting to identify each data segment with our originally defined
data_id, but this is not strictly a requirement on the database, it's more
of an agreed upon convention that we are going to use to identify the data
segments and distribute them to our consumers.  From a database stand point,
the only requirement is that the data_id be unique.  It could be anything as
far as the database is concerned, as long as the data_id remains unique, it
doesn't matter that it may be overloaded with some duplicate information.
Any more experienced DBA's than I have an opinion on this?

Thanks!


Re: Normalization question

2005-12-09 Thread SGreen
Rod Heyd [EMAIL PROTECTED] wrote on 12/09/2005 11:01:38 AM:

 Hi Everyone,
 
 I've got a little debate that maybe you can all help me settle.
 
 I have three tables that I need to join, but there are performance 
problems
 with the joins due to some misunderstandings of my predecessor about 
what's
 needed to join the tables efficiently.
 
 Here's the setup.
 
 t1 has a unique key defined on it, call it command_id, t1 has a 1 to 
many
 relationship with t2.
 t2 has t1's command_id as a foreign key constraint, plus a set of one or
 more instrument_id's.
 Each command results in one or more instruments taking data.
 
 The commanding elements defined here are then set to our instrument to
 aquire the data.
 When the data comes back it is organized into t3 by command_id and
 instrument_id.
 So the primary key on t3 looks like this: command_id_instrument_id.
 
 So, now I need to write a query that looks at what was commanded in t1 
and
 t2 and then look for
 any missing data that has not yet been received.
 
 So, I've got a query that looks something like this:
 
 SELECT  stuff
 
 FROM
 t1
 JOIN
 t2
 ON
t1.command_id = t2.command_id
 
 LEFT JOIN
   t3
 ON
 t3.data_id = concat(t1.command_id,'_',t2.instrument_id)
 
 Now, I think everyone is going to see immediately that the join on t3 is
 going to have absolutely horrible performance,
 the join condition here can't take advantage of any indexes since the 
string
 function pretty much destroys any hope of that. To make matters worse, 
the
 left join is a total killer.
 
 
 So my suggestion to solve the performance bottleneck is to add two 
columns
 to t3,
 command_id and instrument_id, and create a combined index on the 
command_id
 and instrument_id
 columns.
 
 the join condition on t3 then becomes:
 
 LEFT JOIN
  t3
 ON
 t3.command_id = t1.command_id
 AND
 t3.instrument_id = t2.instrument_id
 
 This works beautifully!  The performance of this new join condition is 
about
 480 times faster than the original.
 
 Here's the rub.  Since there is a unique data_id that already exists 
which
 combines the information in both
 command_id and instrument_id keys, I'm being told by our seasoned 
software
 developer that I am violating classic database design rules against
 redundant data.
 
 In my opinion, this is a minor violation at best.  We have a good 
reason
 for wanting to identify each data segment with our originally defined
 data_id, but this is not strictly a requirement on the database, it's 
more
 of an agreed upon convention that we are going to use to identify the 
data
 segments and distribute them to our consumers.  From a database stand 
point,
 the only requirement is that the data_id be unique.  It could be 
anything as
 far as the database is concerned, as long as the data_id remains unique, 
it
 doesn't matter that it may be overloaded with some duplicate 
information.
 Any more experienced DBA's than I have an opinion on this?
 
 Thanks!

You are not duplicating data by referring to objects by their id values. 
The false-normalization through the use of the composite key on t3 was a 
mistake. By obscuring the actual relationships between t1, t2, and t3, 
your original designer broke one of the cardinal rules of designing a 
relational database. Each foreign key should point to at least  (and 
usually at most) one row on a single table. His foreign key pointed to any 
row on either table. That was a bad design decision that created serious 
performance bottlenecks.  If you have the chance to fix his design of t3 
to keep the separate columns for each FK, please do it. If you want to 
keep the composite key for historical reasons, I don't see why that can't 
happen but you should stop using it as the primary key for the table. 

The drawback to composite keys is that they could possibly refer to more 
than one thing.  Assume for a moment that you are creating a composite key 
out of two text-type fields. Also imagine that you are compositing those 
fields with a _ character (as in your example). This pair ('testme_a', 
'secondhalf') and this pair ('testme', 'a_secondhalf') would both evaluate 
to 'testme_a_secondhalf'. We lose our 1:1 correspondence between key and 
parent.

Obviously, my example was contrived and using just numbers can avoid this 
kind of behavior but it illustrated why it is just not correct to 
construct foreign keys. Foreign keys are SUPPOSED to contain values that 
actually match something in the other table. Using them does not violate 
the no duplicates rule your friend describes and can be used to enforce 
relational integrity.

Just so you have it on record (IMHO): You are correct and the other DBA 
should read up about modern relational databases and modern theories of 
normalization.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: Normalization question

2005-12-09 Thread Michael Stassen

Rod Heyd wrote:

Hi Everyone,

I've got a little debate that maybe you can all help me settle.

I have three tables that I need to join, but there are performance problems
with the joins due to some misunderstandings of my predecessor about what's
needed to join the tables efficiently.

Here's the setup.

t1 has a unique key defined on it, call it command_id, t1 has a 1 to many
relationship with t2.
t2 has t1's command_id as a foreign key constraint, plus a set of one or
more instrument_id's.
Each command results in one or more instruments taking data.

The commanding elements defined here are then set to our instrument to aquire
the data. When the data comes back it is organized into t3 by command_id and 
instrument_id. So the primary key on t3 looks like this:

command_id_instrument_id.


Yuck!


So, now I need to write a query that looks at what was commanded in t1 and t2
and then look for any missing data that has not yet been received.

So, I've got a query that looks something like this:

SELECT  stuff
FROM t1 JOIN t2
ON t1.command_id = t2.command_id
LEFT JOIN t3
ON t3.data_id = concat(t1.command_id,'_',t2.instrument_id)

Now, I think everyone is going to see immediately that the join on t3 is
going to have absolutely horrible performance,
the join condition here can't take advantage of any indexes since the string
function pretty much destroys any hope of that. To make matters worse, the
left join is a total killer.


I think the LEFT JOIN isn't that big a deal, if the proper index is there and 
usable.  You can easily find out by comparing the speed of your query below 
against the speed of the same query without LEFT.



So my suggestion to solve the performance bottleneck is to add two columns to
t3, command_id and instrument_id, and create a combined index on the
command_id and instrument_id columns.


Yes, exactly.  The combined index should be UNIQUE.


the join condition on t3 then becomes:

LEFT JOIN t3
ON t3.command_id = t1.command_id AND t3.instrument_id = t2.instrument_id

This works beautifully!  The performance of this new join condition is about
480 times faster than the original.


Yes, this is the way this should be done.

Here's the rub. Since there is a unique data_id that already exists which 
combines the information in both command_id and instrument_id keys, I'm being

told by our seasoned software developer that I am violating classic
database design rules against redundant data.


It is true that you now have redundant data.  Clearly, you do not need both the 
two new columns, command_id and instrument_id, and the old column, 
command_id_instrument_id.  Redundant data is a violation of classic database 
design rules, so one or the other has to go if you want to follow the rules.


Which should go?  Well, the old column, command_id_instrument_id, combines the 
answer to two questions, Which command? and Which instrument?, into one 
column.  That is also also a violation of the classic database design rules, 
and a really bad idea.  It leads to precisely the sort of problem you are trying 
to fix.  It also makes it difficult to find the rows in t3 which belong to a 
particular command, or to a particular instrument.  Those queries require string 
matching, and the latter could not use an index.  If you want to follow the 
rules, drop the old column.



In my opinion, this is a minor violation at best.  We have a good reason
for wanting to identify each data segment with our originally defined
data_id, but this is not strictly a requirement on the database, it's more
of an agreed upon convention that we are going to use to identify the data
segments and distribute them to our consumers.  From a database stand point,
the only requirement is that the data_id be unique.  It could be anything as
far as the database is concerned, as long as the data_id remains unique, it
doesn't matter that it may be overloaded with some duplicate information.
Any more experienced DBA's than I have an opinion on this?


The strictly correct solution is to replace the old, broken column with your two 
new columns.  You can either make the combined index on the two new columns the 
PRIMARY KEY, or you can make an AUTO_INCREMENT primary key and define the 
combination of the two columns as UNIQUE.  It is then trivial to


  SELECT CONCAT(command_id, '_', instrument_id) ...

when you want to display the t3 id using the agreed upon convention.

In short, your seasoned software developer is right to want to follow the rules, 
but the rules dictate replacing the old column with the two new columns.



Thanks!


Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: normalization question

2002-10-21 Thread Artem Koltsov
Hi!
I don't understand problem. Car can belong to only one Dealer, and one Dealer can have 
many Cars, therefore DealerID should be in Car table. Also Car can be only one Model, 
but there are many Cars the same Model. It looks quite normal to put ModelID into Car 
table, and I don't see any possible problems. Of course if there are any special 
requirements for the database, they have to be incorporated into the model.
Am I missing your point?

Artem

-Original Message-
From: [EMAIL PROTECTED] [mailto:speters;metromls.com]
Sent: Monday, October 21, 2002 5:43 PM
To: [EMAIL PROTECTED]
Subject: normalization question


I'm working on a design for a database that is leaving me with some
normalization issues that I havent had to deal with before. For example, if
im trying to store
information about car dealerships I end up with the following situation:

I want to store the cars in stock at each physical dealership, with certain
attributes.

So, I have a table Dealership as follows:
Dealer_ID
Name
Address
Etc.

Then there is a Car_Make table (for things like Chevy, Hyundai):
Car_Make_ID
Make_Name
...

And a Model table (for things like Elantra, Neon, etc):
Model:
Model_ID
Car_Make_ID
Model_Name
...

And a table to describe the cars themselves: (one row per car)
Car:
VIN_number(vehicle identification number, what the DMV uses)
Model_ID
other attributes
...

Now, to associate a specific car with a dealer, I want to put Dealer_ID in
the Cars table, but then the Cars table has multiple parent tables,
specifically
Model  Dealership

I'm sure that this is far from normal, and will lead to programming
problems,
but i dont see what to do about it.

Any advice would be greatly appreciated.

 thanks,
sean peters
[EMAIL PROTECTED]


sql, query



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

 
Attention:
Any views expressed in this message are those of the individual sender, except where 
the message states otherwise and the sender is authorized to state them to be the 
views of any such entity.

The information contained in this message and or attachments is intended only for the 
person or entity to which it is addressed and may contain confidential and/or 
privileged material. Any review, retransmission, dissemination or other use of, or 
taking of any action in reliance upon, this information by persons or entities other 
than the intended recipient is prohibited. If you received this in error, please 
contact the sender and delete the material from any system and destroy any copies.

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




normalization question

2002-10-21 Thread speters
I'm working on a design for a database that is leaving me with some
normalization issues that I havent had to deal with before. For example, if
im trying to store
information about car dealerships I end up with the following situation:

I want to store the cars in stock at each physical dealership, with certain
attributes.

So, I have a table Dealership as follows:
Dealer_ID
Name
Address
Etc.

Then there is a Car_Make table (for things like Chevy, Hyundai):
Car_Make_ID
Make_Name
...

And a Model table (for things like Elantra, Neon, etc):
Model:
Model_ID
Car_Make_ID
Model_Name
...

And a table to describe the cars themselves: (one row per car)
Car:
VIN_number(vehicle identification number, what the DMV uses)
Model_ID
other attributes
...

Now, to associate a specific car with a dealer, I want to put Dealer_ID in
the Cars table, but then the Cars table has multiple parent tables,
specifically
Model  Dealership

I'm sure that this is far from normal, and will lead to programming
problems,
but i dont see what to do about it.

Any advice would be greatly appreciated.

 thanks,
sean peters
[EMAIL PROTECTED]


sql, query



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: normalization question

2002-10-21 Thread Richard Clarke
Except in the case that cars aren't held at the dealers and any one car can
be purchased from a range of dealers.
In which case you will need a seperate table like,
Dealer_Car_Map (dealer_id,car_id).

Ric.

- Original Message -
From: Artem Koltsov [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Monday, October 21, 2002 9:17 PM
Subject: RE: normalization question


Hi!
I don't understand problem. Car can belong to only one Dealer, and one
Dealer can have many Cars, therefore DealerID should be in Car table. Also
Car can be only one Model, but there are many Cars the same Model. It looks
quite normal to put ModelID into Car table, and I don't see any possible
problems. Of course if there are any special requirements for the database,
they have to be incorporated into the model.
Am I missing your point?

Artem

-Original Message-
From: [EMAIL PROTECTED] [mailto:speters;metromls.com]
Sent: Monday, October 21, 2002 5:43 PM
To: [EMAIL PROTECTED]
Subject: normalization question


I'm working on a design for a database that is leaving me with some
normalization issues that I havent had to deal with before. For example, if
im trying to store
information about car dealerships I end up with the following situation:

I want to store the cars in stock at each physical dealership, with certain
attributes.

So, I have a table Dealership as follows:
Dealer_ID
Name
Address
Etc.

Then there is a Car_Make table (for things like Chevy, Hyundai):
Car_Make_ID
Make_Name
...

And a Model table (for things like Elantra, Neon, etc):
Model:
Model_ID
Car_Make_ID
Model_Name
...

And a table to describe the cars themselves: (one row per car)
Car:
VIN_number(vehicle identification number, what the DMV uses)
Model_ID
other attributes
...

Now, to associate a specific car with a dealer, I want to put Dealer_ID in
the Cars table, but then the Cars table has multiple parent tables,
specifically
Model  Dealership

I'm sure that this is far from normal, and will lead to programming
problems,
but i dont see what to do about it.

Any advice would be greatly appreciated.

 thanks,
sean peters
[EMAIL PROTECTED]


sql, query



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail
[EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php


Attention:
Any views expressed in this message are those of the individual sender,
except where the message states otherwise and the sender is authorized to
state them to be the views of any such entity.

The information contained in this message and or attachments is intended
only for the person or entity to which it is addressed and may contain
confidential and/or privileged material. Any review, retransmission,
dissemination or other use of, or taking of any action in reliance upon,
this information by persons or entities other than the intended recipient is
prohibited. If you received this in error, please contact the sender and
delete the material from any system and destroy any copies.

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php





-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: NEWBIE: normalization question/help/lost :)

2002-09-09 Thread Brent Baisley

Any experienced database person will probably tell you that reaching 
fifth normal form usually makes your database cumbersome to work with. 
Though in theory is the best setup.
It looks like you are trying to create a cascading structure, which I 
think is one of the harder database designs to grasp. The most common 
cascading database design is a discussion board. There appears to be a 
lot of similarities between what you are trying to do and a discussion. 
I would look up a tutorial on creating a basic discussion board, there 
should be a bunch of them on the web.
You are right that you are starting down the wrong road in your design. 
Any database design should be able to expand without hitting a limit 
because of design.
In your example, you seem to be basically classifying almost all your 
data into a code and a description. Each entry may have a parent entry.

This table structure would give you essentially unlimited depth:
RecordID | ParentID | DataID

ParentID contains the RecordID of another entry in the same table. 
DataID is a link to another table that contains the human readable form 
of the data. A NULL ParentID entry indicates you are at the first 
level or A level in your case. You may want to use something else than 
NULL in the ParentID column to represent no parent. I usually use a 
zero since this is typical a an unsigned int column. A NULL then denotes 
some sort of error.
To compile all your information you create a loop to traverse up or down 
the tree.

Hope that gets you started.

On Saturday, September 7, 2002, at 05:56 PM, Thoenen, Peter Mr. EPS 
wrote:

 After a year of doing database programming, try to actually normalize.  
 Read
 a couple online tutorials (phpbuilder, devshed) and the book mysql 2nd
 edition and it just can't seem to stick.  Understanding the 
 concept/theory
 but freeze up every time I try to implement.  Trying to go from 
 0NF--3NF
 (though I can't even really get to 1NF :) ... not worried about 4NF or 
 5NF.
--
Brent Baisley
Systems Architect
Landover Associates, Inc.
Search  Advisory Services for Advanced Technology Environments
p: 212.759.6400/800.759.0577


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: NEWBIE: normalization question/help/lost :)

2002-09-08 Thread Amer Neely

 After a year of doing database programming, try to actually normalize.  Read
 a couple online tutorials (phpbuilder, devshed) and the book mysql 2nd
 edition and it just can't seem to stick.  Understanding the concept/theory
 but freeze up every time I try to implement.  Trying to go from 0NF--3NF
 (though I can't even really get to 1NF :) ... not worried about 4NF or 5NF.
 Anybody want to help walk me through this or point to any GOOD FULL EXAMPLES
 of normalizing a database.

I highly reccomend 'Database Design for Mere Mortals' by Michael J.
Hernandez
ISBN: 0-201-69471-9

or

'Data  Databases: Concepts in Practice' by Joe Celko
ISBN: 1-55860-432-4

They have both added much to my understandinf of RDBMs and information
management.
-- 
/* All outgoing email scanned by Norton Antivirus 2002 */
Amer Neely, Softouch Information Services
W: www.softouch.on.ca
E: [EMAIL PROTECTED]
V: 519.438.5887
Perl | PHP | MySQL | CGI programming for all data entry forms.
We make web sites work!


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




NEWBIE: normalization question/help/lost :)

2002-09-07 Thread Thoenen, Peter Mr. EPS

Hello,

After a year of doing database programming, try to actually normalize.  Read
a couple online tutorials (phpbuilder, devshed) and the book mysql 2nd
edition and it just can't seem to stick.  Understanding the concept/theory
but freeze up every time I try to implement.  Trying to go from 0NF--3NF
(though I can't even really get to 1NF :) ... not worried about 4NF or 5NF.

Currently I have the following 0NF table
--  ID 
ID  |   ColA|   ColB|   ColC|
1   |   A   |   A.A |   A.A.A   |   
2   |   A   |   A.A |   A.A.B   |
3   |   A   |   A.B |   A.B.A   |
4   |   A   |   A.B |   A.B.B   |
5   |   A   |   A.A |   A.A.B   |
6   |   B   |   B.A |   B.A.A   |
{ . repeat , values in ColA|ColB|ColC are dependent on each other
but arbitrary in assignment .. could pick 1 X | X.M | X.M.X followed by 2 |
X | X.M | X.M.X if you really want .. only thing unique is ID}

Where:
ID(unique)-ColA (one-one)
ColA--ColB (one-Many)
ColB--ColC(one-Many)

(basically ColB is a subset of values dependent on the value chosen for ColA
and Colc is a subset of values dependent on the value chosen for ColB, ID
the only unique value)

Now as this goes on and on repeating, will start to see plenty of redundancy
in ColA, ColB, ColC (tickets get assigned the same values to ColA, ColB,
ColC) and I am lost as where to go from here to be honest.



**

The following is my WRONG attempt to normalize :)

For the information allow to be chosen for ColA, ColB, ColC I have
additional tables defined

-- ColA --
VALUE
A
B
C
D
...

then a table is created for EACH value in ColA 

A- (repeat this for B, C, D, )
VALUE
A.A
A.B
A.C
A.D
...

and then additional I create a table for EACH value in TABLE A, B, C,
...

---A.A- (repeat for all All A.A, A.B... to include B.A, B.B...,
arb_value.sub_value..
VALUE
A.A.A
A.A.B
A.A.C

Form based and not pretty :)

basically:

SELECT value FROM ColA

List values
Choose ColA_value

SELECT value FROM $ColA_value   #var holding table name for subset of A

List ColA_value
Choose ColA_value_sub

SELECT value FROM $ColA_value_sub   #var holding table name from subset
of A.A

List ColA_value_sub
Choose ColA_value_sub_sub

INSERT INTO id(ColA, ColB, ColC) VALUES (ColA_value, ColA_value_sub,
ColA_value_sub_sub)

This only works since I have about 20 tables  I am pretty sure this is
wrong ... might as well use flat text files if I am going to do it this way.


*


Anybody want to help walk me through this or point to any GOOD FULL EXAMPLES
of normalizing a database.

Cheers,

-Peter

##
Peter Thoenen - Systems Programmer
Commercial Communications
Camp Bondsteel, Kosovo
##

Stumbled Upon...heh (Score:5, Funny) /.
by $carab on 23:00 23 August 2002 (#4131637)

ForensicTec officials said they stumbled upon the military networks about
two months ago, while checking on network security for a private-sector
client.

Someone new to a Dvorak probably tried to type in lynx
http://www.google.com; but instead got nmap -v -p 1-1024 -sS -P0 army.mil
-T paranoid.

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail mysql-unsubscribe-##L=##[EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




data normalization question

2002-09-05 Thread speters

Does anyone have or know of any good books that discuss or teach normal
forms in great detail. I'd like to say that
im pretty good with 2NF, i'm looking for info on more advanced
normalizations.

A specific normalization question:
if i have three tables, call them A, B, C is it ever proper to have key
relations that end up like:
A-B-C-A

where X-Y means that records in X are associated with records in Y through
a foreign key type relation.

This seems implicitly wrong, but data doesnt always seem to model itself in
a tree like structure.

sean peters
[EMAIL PROTECTED]

mysql, query



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: Normalization question

2002-06-04 Thread Daren Cotter


I have a question about the setup of the tables in my database.

In my members table, I store a lot of info about demographics, such as
marital status, income, etc. The way I'm doing this is the enum type. Is
it better to use this, or would it be better to create separate tables
for each demographic I want to store? Advantages of this would be no
additional programming required when I want to list the # of members who
have selected each demographic...plus I could add new items any time I
wanted (i.e. I want to add widowed to the list of marital statuses).

What is the correct way to do this?




-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, June 04, 2002 6:24 PM
To: Daren Cotter
Subject: Re: Normalization question 


Your message cannot be posted because it appears to be either spam or
simply off topic to our filter. To bypass the filter you must include
one of the following words in your message:

sql,query

If you just reply to this message, and include the entire text of it in
the reply, your reply will go through. However, you should first review
the text of the message to make sure it has something to do with MySQL.
Just typing the word MySQL once will be sufficient, for example.

You have written the following:

I have a question about the setup of the tables in my database.

In my members table, I store a lot of info about demographics, such as
marital status, income, etc. The way I'm doing this is the enum type. Is
it better to use this, or would it be better to create separate tables
for each demographic I want to store? Advantages of this would be no
additional programming required when I want to list the # of members who
have selected each demographic...plus I could add new items any time I
wanted (i.e. I want to add widowed to the list of marital statuses).

What is the correct way to do this?



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: Normalization question

2002-06-04 Thread Cal Evans

I won't purport to tell you the 'correct' way to do it but I will tell you
how I would do it.

When I have information (marital status) I create a 'domain table'. I have 2
rules for most of my domain tables.

1: they end in the word Type. (maritalStatusType)
2: they have 3 fields:
 The PK (maritalStatusTypeID)
 description varChar(100)
 endDate date

So in this case
maritalStatusTypeID description endDate
1   Married null
2   Single  null
3   Divorcednull
4   Widowed null
5   Shacked Up  2002-06-03

Now, in your members table you store the maritalStatusTypeID as a FK.

then a simply

select member.*,
   maritalStatusType.description
  from member left join maritalStatusType on member.maritalStatusTypeID =
maritalStatusType.maritalStatusTypeID


If you use the domain table to populate a select box (which I do) then the
endDate provides a mechanism whereby you can stop if from appearing in the
select box without deleting the record. (because, as we all know, deleting a
record from a domain table would leave orphaned FKs in the members table and
orphans are bad mojo.)

Anyhow, this has worked for me in the past.

=C=


*
* Cal Evans
* Journeyman Programmer
* Techno-Mage
* http://www.calevans.com
*


-Original Message-
From: Daren Cotter [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 04, 2002 6:38 PM
To: [EMAIL PROTECTED]
Subject: RE: Normalization question



I have a question about the setup of the tables in my database.

In my members table, I store a lot of info about demographics, such as
marital status, income, etc. The way I'm doing this is the enum type. Is
it better to use this, or would it be better to create separate tables
for each demographic I want to store? Advantages of this would be no
additional programming required when I want to list the # of members who
have selected each demographic...plus I could add new items any time I
wanted (i.e. I want to add widowed to the list of marital statuses).

What is the correct way to do this?




-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 04, 2002 6:24 PM
To: Daren Cotter
Subject: Re: Normalization question


Your message cannot be posted because it appears to be either spam or
simply off topic to our filter. To bypass the filter you must include
one of the following words in your message:

sql,query

If you just reply to this message, and include the entire text of it in
the reply, your reply will go through. However, you should first review
the text of the message to make sure it has something to do with MySQL.
Just typing the word MySQL once will be sufficient, for example.

You have written the following:

I have a question about the setup of the tables in my database.

In my members table, I store a lot of info about demographics, such as
marital status, income, etc. The way I'm doing this is the enum type. Is
it better to use this, or would it be better to create separate tables
for each demographic I want to store? Advantages of this would be no
additional programming required when I want to list the # of members who
have selected each demographic...plus I could add new items any time I
wanted (i.e. I want to add widowed to the list of marital statuses).

What is the correct way to do this?



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: normalization question

2001-08-19 Thread Daren Cotter

The data wouldn't need to be stored for any longer than two to three months,
so that shouldn't be a problem...what about my query to get all members that
have not read the mailing, is that possible (assuming I don't use the
reverted logic you were talking about).



Daren Cotter
CEO, InboxDollars.com
http://www.inboxdollars.com
(507) 382-0435

-Original Message-
From: Rene Churchill [mailto:[EMAIL PROTECTED]]
Sent: Sunday, August 19, 2001 7:34 AM
To: Daren Cotter
Cc: [EMAIL PROTECTED]
Subject: Re: normalization question



Hi Daren,

How long do you need to store this information?  The normalized table
that you describe is simple and easy to index.  Having several million
rows in the table won't bother MySQL.  So is it sufficient to keep the
data around for a month and then prune it out of the database?  Long
term records could be dumped out into text logfiles for safe keeping.

Another possiblity is to invert your storage logic.  By that, I mean
insert a row in the cross-reference table for every user that has
NOT read the email.  Then when they read it, remove that row from
the table.  This will automatically prune down the table size as
more and more of your readers check their email.

I would HIGHLY suggest using the DELAYED option on that size of an
insert.

You may not want to go this logic inversion route because of the havoc
it will play with other queries.  For example, since any user that
does NOT have a matching row in the cross-reference table is assumed
to have read the message, any new user joining you will automatically
be assumed to have read all messages to date.

Rene

Daren Cotter wrote:
 I have a table which needs to store which emails each member has read and
 had their account credited for. Currently, I have a field read_array
 longtext in my main members table, and each time a member reads an email,
 it simply concats to the end of this array. For example, they read email
 288, their read array is:

 288

 They read 298, the read_array is:

 288||298

 Using this method (which is far from normalization, I know), I have been
 able to accomplish the two major things I need to be able to accomplish:

 1) Insure that no member can get credit for reading the same mailing twice
 (check their current read_array, using ereg())
 2) Run a query to see how many members have read each mailing (WHERE
 read_array LIKE '%mail_id%')

 I know that both of these tasks would be easier if I created a new table,
 and stored the member_id, mail_id, and date, and the table would then be
 normalized...however, I send approximately one new mailing per day, and
this
 would mean 100,000 new rows in this table every day. Can somone offer me
 expert advice as to which method is better? Using an array that will
grow
 to unlimited size, or using a table that is going to grow much larger
every
 day?

 Also, as a deterrant to switching to the normalized method...I frequently
 run queries to do the following:

 *) After I send a mailing, a week later, I send the mailing again to all
 members who did not confirm reading the email. My Query is like:
 select member_id, (more) FROM members WHERE read_array not like
'%mail_id%'

 Is there any possible way I could accompish this task in one query if I
had
 this process normalized?

--
Rene Churchillhttp://www.vtwebwizard.com
Internet Consulting   802-244-5151
Specializing in Web Programming

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail
[EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




normalization question

2001-08-18 Thread Daren Cotter

I have a table which needs to store which emails each member has read and
had their account credited for. Currently, I have a field read_array
longtext in my main members table, and each time a member reads an email,
it simply concats to the end of this array. For example, they read email
288, their read array is:

288

They read 298, the read_array is:

288||298

Using this method (which is far from normalization, I know), I have been
able to accomplish the two major things I need to be able to accomplish:

1) Insure that no member can get credit for reading the same mailing twice
(check their current read_array, using ereg())
2) Run a query to see how many members have read each mailing (WHERE
read_array LIKE '%mail_id%')

I know that both of these tasks would be easier if I created a new table,
and stored the member_id, mail_id, and date, and the table would then be
normalized...however, I send approximately one new mailing per day, and this
would mean 100,000 new rows in this table every day. Can somone offer me
expert advice as to which method is better? Using an array that will grow
to unlimited size, or using a table that is going to grow much larger every
day?

Also, as a deterrant to switching to the normalized method...I frequently
run queries to do the following:

*) After I send a mailing, a week later, I send the mailing again to all
members who did not confirm reading the email. My Query is like:
select member_id, (more) FROM members WHERE read_array not like '%mail_id%'

Is there any possible way I could accompish this task in one query if I had
this process normalized?

TIA,



Daren Cotter


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php