Re: set type and normalize

2005-04-08 Thread Marilyn Davis
On Thu, 7 Apr 2005, Brent Baisley wrote:

Thank you Brent.

 As mentioned in the manual you specified, there are limits to how many 
 items you can have in a set, among other issues. My biggest issue 
 against using sets is that you need alter the table structure whenever 
 you want to change the set.  Usually I would want the user to be able 
 to add, modify or delete list items. You do NOT want a user being able 
 to alter the table structure.
 The three table layout gives you basically unlimited scalability, 
 easier and quicker modification, dynamic sorting, more flexible 
 searching and no character restrictions.

I can see the unlimited scalability, quicker modification, and no
character restrictions.  Dynamic sorting?  Do you mean there's a
sort that the three table layout can do that can't be done with the
set type?  Or a search?

I'm sorry to be so dense, but can you give me an example?

Thank you again.

Marilyn Davis

 
 
 On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote:
 
  Hi MySQL people.
 
  I'm trying to understand the pros and cons of the set column type.
 
  Here is an excerpt from the article:
 
  http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html
 
 
Why You Shouldn't Use SET
 
  Third, using a set means your data is not normalized. In our above
  example, we are tracking a person's interests for a hypothetical
  dating site. In a normalized schema, there should be three tables: one
  for the person, one for all possible interests, and one that links a
  person to their particular interests.
 
  The above example is:
 
CREATE TABLE set_test(
  rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  myset SET('Travel','Sports','Dancing','Fine Dining')
);
 
  I guess I'm not understanding the point here.  What practical
  advantage is there to creating three tables instead of using a set
  column type?
 
  Thank you for any help.
 
  Marilyn Davis
 
 
 
 
  -- 
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:
  http://lists.mysql.com/[EMAIL PROTECTED]
 
 
 

-- 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: set type and normalize

2005-04-08 Thread Michael Stassen
On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote:
Hi MySQL people.
I'm trying to understand the pros and cons of the set column type.
Here is an excerpt from the article:
http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html
  Why You Shouldn't Use SET
Third, using a set means your data is not normalized. In our above
example, we are tracking a person's interests for a hypothetical
dating site. In a normalized schema, there should be three tables: one
for the person, one for all possible interests, and one that links a
person to their particular interests.
The above example is:
  CREATE TABLE set_test(
rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
myset SET('Travel','Sports','Dancing','Fine Dining')
  );
I guess I'm not understanding the point here.  What practical
advantage is there to creating three tables instead of using a set
column type?
Thank you for any help.
Marilyn Davis
A slightly more realistic example might be
  CREATE TABLE member
  (  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 name CHAR(30),
 interests SET('Travel','Sports','Dancing','Fine Dining')
  );
with some rows like
  id  nameinterests
   1  Joe Travel, Sports
   2  Sue Dancing, Fine Dining
The 3 table alternative (the normalized version) would be
  CREATE TABLE member
  (  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 name CHAR(30),
 INDEX (name)
  );
  CREATE TABLE interest
  (  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 category CHAR(30),
 INDEX (category)
  );
  CREATE TABLE member_interests
  (  member_id INT UNSIGNED NOT NULL,
 interest_id INT UNSIGNED NOT NULL,
 PRIMARY KEY (member_id, interest_id),
 INDEX (interest_id)
  );
Some rows:
member
  id  name
   1  Joe
   2  Sue
interest
  id  category
   1  Travel
   2  Fine Dining
   3  Dancing
   4  Sports
member_interests
  member_id  interest_id
 1  1
 1  4
 2  2
 2  3
Despite the apparent increase in complexity of the 3-table version, it 
has a number of advantages.  Here are a few:

* Retrieving  the list of interests
  SET:
DESCRIBE member interests;
  3 TABLES:
SELECT category FROM interest;
In the SET case, you have to parse the output to find the interests, 
while in the 3 TABLES case you simply get a list of interests.

* Adding a new interest
  SET:
ALTER TABLE member CHANGE interests interests 
SET('Travel','Sports','Dancing','Fine Dining', 'Literature');

  3 TABLES:
INSERT INTO interest (category) VALUES ('Literature');
In the SET case, you have to change the table structure!  Mysql makes a 
temporary table with the new column definition, copies all the data 
from the member table to this temporary table, then replaces member 
with the temporary table.  You cannot edit (INSERT, DELETE, UPDATE) the 
member table till this is done.  (See 
http://dev.mysql.com/doc/mysql/en/alter-table.html)  Also, while 
mysql should correctly convert the interests data to the new column 
definition, you should make a backup first, just in case.  In the 3 
TABLE case, on the other hand, this is a simple, quick insert.

* Finding members with a particular interest:
  SET:
SELECT id, name FROM member WHERE FIND_IN_SET('Sports', interests);
  3 TABLES:
SELECT m.id, m.name
FROM member m
JOIN member_interest mi ON m.id = mi.member_id
JOIN interest i ON mi.interest_id = i.id
WHERE i.category = 'Sports';
In the SET case, the query *looks* simpler, but mysql cannot use an 
index on member.interests to narrow the search.  That is, mysql has to 
examine every single row of the member table (a full table scan), 
looking at the value in the interests column to see if 'Sports' is in 
there.  That's slow, and it gets worse as the table grows.  In the 3 
TABLE case, however, the index on interest.category will be used to 
find precisely those rows with the correct interest, without looking at 
any other rows.  That's fast, and scales better as the table grows.

If your table is small and the load on your server is low, the SET type 
may work fine.  If this is a table of a 100 or so friends and 
relatives, for example, to be used by you (and perhaps your spouse), 
the limitations of the SET type may not be a problem.  As the table 
gets larger and/or the number of users goes up, however, those 
limitations will kill you.

Michael
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: set type and normalize

2005-04-08 Thread Brent Baisley
I may have been reaching a little on the dynamic sorting since some of 
what I had in mind entails adding more data to the categories. But at a 
minimum, you would be able to sort ascending or descending and possibly 
use an index on the sort depending on the query. Depending on how you 
join your tables in  the query, you would be able to replicate the main 
data so it appears under each category it is assigned to. Think of 
sorting/grouping.
You would be able to sort on the second word in each category if you 
wanted. You could also added more data to your categories, like a sort 
order field or a weighting field for relevance sorting. Since you are 
actually using a database table, you are not just stuck with using a 
word, you can link other data to the category and use it for 
calculations or sorting. You could even create subcategories, and sort 
on category/subcategory.

On Apr 8, 2005, at 12:25 PM, Marilyn Davis wrote:
On Thu, 7 Apr 2005, Brent Baisley wrote:
Thank you Brent.
As mentioned in the manual you specified, there are limits to how many
items you can have in a set, among other issues. My biggest issue
against using sets is that you need alter the table structure whenever
you want to change the set.  Usually I would want the user to be able
to add, modify or delete list items. You do NOT want a user being able
to alter the table structure.
The three table layout gives you basically unlimited scalability,
easier and quicker modification, dynamic sorting, more flexible
searching and no character restrictions.
I can see the unlimited scalability, quicker modification, and no
character restrictions.  Dynamic sorting?  Do you mean there's a
sort that the three table layout can do that can't be done with the
set type?  Or a search?
I'm sorry to be so dense, but can you give me an example?
Thank you again.
Marilyn Davis

On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote:
Hi MySQL people.
I'm trying to understand the pros and cons of the set column type.
Here is an excerpt from the article:
http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html
  Why You Shouldn't Use SET
Third, using a set means your data is not normalized. In our above
example, we are tracking a person's interests for a hypothetical
dating site. In a normalized schema, there should be three tables: 
one
for the person, one for all possible interests, and one that links a
person to their particular interests.

The above example is:
  CREATE TABLE set_test(
rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
myset SET('Travel','Sports','Dancing','Fine Dining')
  );
I guess I'm not understanding the point here.  What practical
advantage is there to creating three tables instead of using a set
column type?
Thank you for any help.
Marilyn Davis

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]


--

--
Brent Baisley
Systems Architect
Landover Associates, Inc.
Search  Advisory Services for Advanced Technology Environments
p: 212.759.6400/800.759.0577
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: set type and normalize

2005-04-08 Thread Marilyn Davis
On Fri, 8 Apr 2005, Brent Baisley wrote:

 I may have been reaching a little on the dynamic sorting since some of 
 what I had in mind entails adding more data to the categories. But at a 

I see.  Thank you.

 minimum, you would be able to sort ascending or descending and possibly 
 use an index on the sort depending on the query. Depending on how you 
 join your tables in  the query, you would be able to replicate the main 
 data so it appears under each category it is assigned to. Think of 
 sorting/grouping.
 You would be able to sort on the second word in each category if you 
 wanted. You could also added more data to your categories, like a sort 
 order field or a weighting field for relevance sorting. Since you are 
 actually using a database table, you are not just stuck with using a 
 word, you can link other data to the category and use it for 
 calculations or sorting. You could even create subcategories, and sort 
 on category/subcategory.
 

Yes, very powerful and flexible.  But I think it's overkill for my
situation.

And thank you Michael!  I made hard-copy of your great explanation and
studied it hard.

I'm feeling more confident about going ahead with the set type for my
table.  

I have been using an enum for the 5 possible values we have in a
field, each one flagging a different process.  But very occasionally,
two values might be appropriate, and some other considerations must be
taken in account to choose the process on the fly.

So I'm upgrading from enum to set on the field.

Users never see or know about the elements of the SET and certainly
never can add one.  It's nothing interesting like dancing and
traveling.

So, I'm feeling pretty confident that it's the right thing for us.

Thank you for the great help!  I mean *really* great help.

Marilyn Davis


On Fri, 8 Apr 2005, Michael Stassen wrote:

 
 On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote:
 
  Hi MySQL people.
 
  I'm trying to understand the pros and cons of the set column type.
 
  Here is an excerpt from the article:
 
  http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html
 
 
Why You Shouldn't Use SET
 
  Third, using a set means your data is not normalized. In our above
  example, we are tracking a person's interests for a hypothetical
  dating site. In a normalized schema, there should be three tables: one
  for the person, one for all possible interests, and one that links a
  person to their particular interests.
 
  The above example is:
 
CREATE TABLE set_test(
  rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  myset SET('Travel','Sports','Dancing','Fine Dining')
);
 
  I guess I'm not understanding the point here.  What practical
  advantage is there to creating three tables instead of using a set
  column type?
 
  Thank you for any help.
 
  Marilyn Davis
 
 A slightly more realistic example might be
 
CREATE TABLE member
(  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   name CHAR(30),
   interests SET('Travel','Sports','Dancing','Fine Dining')
);
 
 with some rows like
 
id  nameinterests
 1  Joe Travel, Sports
 2  Sue Dancing, Fine Dining
 
 The 3 table alternative (the normalized version) would be
 
CREATE TABLE member
(  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   name CHAR(30),
   INDEX (name)
);
 
CREATE TABLE interest
(  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   category CHAR(30),
   INDEX (category)
);
 
CREATE TABLE member_interests
(  member_id INT UNSIGNED NOT NULL,
   interest_id INT UNSIGNED NOT NULL,
   PRIMARY KEY (member_id, interest_id),
   INDEX (interest_id)
);
 
 Some rows:
 
 member
id  name
 1  Joe
 2  Sue
 
 interest
id  category
 1  Travel
 2  Fine Dining
 3  Dancing
 4  Sports
 
 member_interests
member_id  interest_id
   1  1
   1  4
   2  2
   2  3
 
 Despite the apparent increase in complexity of the 3-table version, it 
 has a number of advantages.  Here are a few:
 
 * Retrieving  the list of interests
SET:
  DESCRIBE member interests;
 
3 TABLES:
  SELECT category FROM interest;
 
 In the SET case, you have to parse the output to find the interests, 
 while in the 3 TABLES case you simply get a list of interests.
 
 * Adding a new interest
SET:
  ALTER TABLE member CHANGE interests interests 
 SET('Travel','Sports','Dancing','Fine Dining', 'Literature');
 
3 TABLES:
  INSERT INTO interest (category) VALUES ('Literature');
 
 In the SET case, you have to change the table structure!  Mysql makes a 
 temporary table with the new column definition, copies all the data 
 from the member table to this temporary table, then replaces member 
 with the temporary table.  You cannot edit (INSERT, DELETE, UPDATE) the 
 member table till this is done.  (See 
 

set type and normalize

2005-04-07 Thread Marilyn Davis
Hi MySQL people.  

I'm trying to understand the pros and cons of the set column type.

Here is an excerpt from the article:

http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html


  Why You Shouldn't Use SET

Third, using a set means your data is not normalized. In our above
example, we are tracking a person's interests for a hypothetical
dating site. In a normalized schema, there should be three tables: one
for the person, one for all possible interests, and one that links a
person to their particular interests.

The above example is:

  CREATE TABLE set_test(
rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
myset SET('Travel','Sports','Dancing','Fine Dining')
  );

I guess I'm not understanding the point here.  What practical
advantage is there to creating three tables instead of using a set
column type?

Thank you for any help.

Marilyn Davis




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: set type and normalize

2005-04-07 Thread Brent Baisley
As mentioned in the manual you specified, there are limits to how many 
items you can have in a set, among other issues. My biggest issue 
against using sets is that you need alter the table structure whenever 
you want to change the set.  Usually I would want the user to be able 
to add, modify or delete list items. You do NOT want a user being able 
to alter the table structure.
The three table layout gives you basically unlimited scalability, 
easier and quicker modification, dynamic sorting, more flexible 
searching and no character restrictions.

On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote:
Hi MySQL people.
I'm trying to understand the pros and cons of the set column type.
Here is an excerpt from the article:
http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html
  Why You Shouldn't Use SET
Third, using a set means your data is not normalized. In our above
example, we are tracking a person's interests for a hypothetical
dating site. In a normalized schema, there should be three tables: one
for the person, one for all possible interests, and one that links a
person to their particular interests.
The above example is:
  CREATE TABLE set_test(
rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
myset SET('Travel','Sports','Dancing','Fine Dining')
  );
I guess I'm not understanding the point here.  What practical
advantage is there to creating three tables instead of using a set
column type?
Thank you for any help.
Marilyn Davis

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]


--
Brent Baisley
Systems Architect
Landover Associates, Inc.
Search  Advisory Services for Advanced Technology Environments
p: 212.759.6400/800.759.0577
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]