Re: set type and normalize
On Thu, 7 Apr 2005, Brent Baisley wrote: Thank you Brent. As mentioned in the manual you specified, there are limits to how many items you can have in a set, among other issues. My biggest issue against using sets is that you need alter the table structure whenever you want to change the set. Usually I would want the user to be able to add, modify or delete list items. You do NOT want a user being able to alter the table structure. The three table layout gives you basically unlimited scalability, easier and quicker modification, dynamic sorting, more flexible searching and no character restrictions. I can see the unlimited scalability, quicker modification, and no character restrictions. Dynamic sorting? Do you mean there's a sort that the three table layout can do that can't be done with the set type? Or a search? I'm sorry to be so dense, but can you give me an example? Thank you again. Marilyn Davis On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote: Hi MySQL people. I'm trying to understand the pros and cons of the set column type. Here is an excerpt from the article: http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html Why You Shouldn't Use SET Third, using a set means your data is not normalized. In our above example, we are tracking a person's interests for a hypothetical dating site. In a normalized schema, there should be three tables: one for the person, one for all possible interests, and one that links a person to their particular interests. The above example is: CREATE TABLE set_test( rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, myset SET('Travel','Sports','Dancing','Fine Dining') ); I guess I'm not understanding the point here. What practical advantage is there to creating three tables instead of using a set column type? Thank you for any help. Marilyn Davis -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: set type and normalize
On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote: Hi MySQL people. I'm trying to understand the pros and cons of the set column type. Here is an excerpt from the article: http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html Why You Shouldn't Use SET Third, using a set means your data is not normalized. In our above example, we are tracking a person's interests for a hypothetical dating site. In a normalized schema, there should be three tables: one for the person, one for all possible interests, and one that links a person to their particular interests. The above example is: CREATE TABLE set_test( rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, myset SET('Travel','Sports','Dancing','Fine Dining') ); I guess I'm not understanding the point here. What practical advantage is there to creating three tables instead of using a set column type? Thank you for any help. Marilyn Davis A slightly more realistic example might be CREATE TABLE member ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name CHAR(30), interests SET('Travel','Sports','Dancing','Fine Dining') ); with some rows like id nameinterests 1 Joe Travel, Sports 2 Sue Dancing, Fine Dining The 3 table alternative (the normalized version) would be CREATE TABLE member ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name CHAR(30), INDEX (name) ); CREATE TABLE interest ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, category CHAR(30), INDEX (category) ); CREATE TABLE member_interests ( member_id INT UNSIGNED NOT NULL, interest_id INT UNSIGNED NOT NULL, PRIMARY KEY (member_id, interest_id), INDEX (interest_id) ); Some rows: member id name 1 Joe 2 Sue interest id category 1 Travel 2 Fine Dining 3 Dancing 4 Sports member_interests member_id interest_id 1 1 1 4 2 2 2 3 Despite the apparent increase in complexity of the 3-table version, it has a number of advantages. Here are a few: * Retrieving the list of interests SET: DESCRIBE member interests; 3 TABLES: SELECT category FROM interest; In the SET case, you have to parse the output to find the interests, while in the 3 TABLES case you simply get a list of interests. * Adding a new interest SET: ALTER TABLE member CHANGE interests interests SET('Travel','Sports','Dancing','Fine Dining', 'Literature'); 3 TABLES: INSERT INTO interest (category) VALUES ('Literature'); In the SET case, you have to change the table structure! Mysql makes a temporary table with the new column definition, copies all the data from the member table to this temporary table, then replaces member with the temporary table. You cannot edit (INSERT, DELETE, UPDATE) the member table till this is done. (See http://dev.mysql.com/doc/mysql/en/alter-table.html) Also, while mysql should correctly convert the interests data to the new column definition, you should make a backup first, just in case. In the 3 TABLE case, on the other hand, this is a simple, quick insert. * Finding members with a particular interest: SET: SELECT id, name FROM member WHERE FIND_IN_SET('Sports', interests); 3 TABLES: SELECT m.id, m.name FROM member m JOIN member_interest mi ON m.id = mi.member_id JOIN interest i ON mi.interest_id = i.id WHERE i.category = 'Sports'; In the SET case, the query *looks* simpler, but mysql cannot use an index on member.interests to narrow the search. That is, mysql has to examine every single row of the member table (a full table scan), looking at the value in the interests column to see if 'Sports' is in there. That's slow, and it gets worse as the table grows. In the 3 TABLE case, however, the index on interest.category will be used to find precisely those rows with the correct interest, without looking at any other rows. That's fast, and scales better as the table grows. If your table is small and the load on your server is low, the SET type may work fine. If this is a table of a 100 or so friends and relatives, for example, to be used by you (and perhaps your spouse), the limitations of the SET type may not be a problem. As the table gets larger and/or the number of users goes up, however, those limitations will kill you. Michael -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: set type and normalize
I may have been reaching a little on the dynamic sorting since some of what I had in mind entails adding more data to the categories. But at a minimum, you would be able to sort ascending or descending and possibly use an index on the sort depending on the query. Depending on how you join your tables in the query, you would be able to replicate the main data so it appears under each category it is assigned to. Think of sorting/grouping. You would be able to sort on the second word in each category if you wanted. You could also added more data to your categories, like a sort order field or a weighting field for relevance sorting. Since you are actually using a database table, you are not just stuck with using a word, you can link other data to the category and use it for calculations or sorting. You could even create subcategories, and sort on category/subcategory. On Apr 8, 2005, at 12:25 PM, Marilyn Davis wrote: On Thu, 7 Apr 2005, Brent Baisley wrote: Thank you Brent. As mentioned in the manual you specified, there are limits to how many items you can have in a set, among other issues. My biggest issue against using sets is that you need alter the table structure whenever you want to change the set. Usually I would want the user to be able to add, modify or delete list items. You do NOT want a user being able to alter the table structure. The three table layout gives you basically unlimited scalability, easier and quicker modification, dynamic sorting, more flexible searching and no character restrictions. I can see the unlimited scalability, quicker modification, and no character restrictions. Dynamic sorting? Do you mean there's a sort that the three table layout can do that can't be done with the set type? Or a search? I'm sorry to be so dense, but can you give me an example? Thank you again. Marilyn Davis On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote: Hi MySQL people. I'm trying to understand the pros and cons of the set column type. Here is an excerpt from the article: http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html Why You Shouldn't Use SET Third, using a set means your data is not normalized. In our above example, we are tracking a person's interests for a hypothetical dating site. In a normalized schema, there should be three tables: one for the person, one for all possible interests, and one that links a person to their particular interests. The above example is: CREATE TABLE set_test( rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, myset SET('Travel','Sports','Dancing','Fine Dining') ); I guess I'm not understanding the point here. What practical advantage is there to creating three tables instead of using a set column type? Thank you for any help. Marilyn Davis -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- -- Brent Baisley Systems Architect Landover Associates, Inc. Search Advisory Services for Advanced Technology Environments p: 212.759.6400/800.759.0577 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: set type and normalize
On Fri, 8 Apr 2005, Brent Baisley wrote: I may have been reaching a little on the dynamic sorting since some of what I had in mind entails adding more data to the categories. But at a I see. Thank you. minimum, you would be able to sort ascending or descending and possibly use an index on the sort depending on the query. Depending on how you join your tables in the query, you would be able to replicate the main data so it appears under each category it is assigned to. Think of sorting/grouping. You would be able to sort on the second word in each category if you wanted. You could also added more data to your categories, like a sort order field or a weighting field for relevance sorting. Since you are actually using a database table, you are not just stuck with using a word, you can link other data to the category and use it for calculations or sorting. You could even create subcategories, and sort on category/subcategory. Yes, very powerful and flexible. But I think it's overkill for my situation. And thank you Michael! I made hard-copy of your great explanation and studied it hard. I'm feeling more confident about going ahead with the set type for my table. I have been using an enum for the 5 possible values we have in a field, each one flagging a different process. But very occasionally, two values might be appropriate, and some other considerations must be taken in account to choose the process on the fly. So I'm upgrading from enum to set on the field. Users never see or know about the elements of the SET and certainly never can add one. It's nothing interesting like dancing and traveling. So, I'm feeling pretty confident that it's the right thing for us. Thank you for the great help! I mean *really* great help. Marilyn Davis On Fri, 8 Apr 2005, Michael Stassen wrote: On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote: Hi MySQL people. I'm trying to understand the pros and cons of the set column type. Here is an excerpt from the article: http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html Why You Shouldn't Use SET Third, using a set means your data is not normalized. In our above example, we are tracking a person's interests for a hypothetical dating site. In a normalized schema, there should be three tables: one for the person, one for all possible interests, and one that links a person to their particular interests. The above example is: CREATE TABLE set_test( rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, myset SET('Travel','Sports','Dancing','Fine Dining') ); I guess I'm not understanding the point here. What practical advantage is there to creating three tables instead of using a set column type? Thank you for any help. Marilyn Davis A slightly more realistic example might be CREATE TABLE member ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name CHAR(30), interests SET('Travel','Sports','Dancing','Fine Dining') ); with some rows like id nameinterests 1 Joe Travel, Sports 2 Sue Dancing, Fine Dining The 3 table alternative (the normalized version) would be CREATE TABLE member ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name CHAR(30), INDEX (name) ); CREATE TABLE interest ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, category CHAR(30), INDEX (category) ); CREATE TABLE member_interests ( member_id INT UNSIGNED NOT NULL, interest_id INT UNSIGNED NOT NULL, PRIMARY KEY (member_id, interest_id), INDEX (interest_id) ); Some rows: member id name 1 Joe 2 Sue interest id category 1 Travel 2 Fine Dining 3 Dancing 4 Sports member_interests member_id interest_id 1 1 1 4 2 2 2 3 Despite the apparent increase in complexity of the 3-table version, it has a number of advantages. Here are a few: * Retrieving the list of interests SET: DESCRIBE member interests; 3 TABLES: SELECT category FROM interest; In the SET case, you have to parse the output to find the interests, while in the 3 TABLES case you simply get a list of interests. * Adding a new interest SET: ALTER TABLE member CHANGE interests interests SET('Travel','Sports','Dancing','Fine Dining', 'Literature'); 3 TABLES: INSERT INTO interest (category) VALUES ('Literature'); In the SET case, you have to change the table structure! Mysql makes a temporary table with the new column definition, copies all the data from the member table to this temporary table, then replaces member with the temporary table. You cannot edit (INSERT, DELETE, UPDATE) the member table till this is done. (See
set type and normalize
Hi MySQL people. I'm trying to understand the pros and cons of the set column type. Here is an excerpt from the article: http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html Why You Shouldn't Use SET Third, using a set means your data is not normalized. In our above example, we are tracking a person's interests for a hypothetical dating site. In a normalized schema, there should be three tables: one for the person, one for all possible interests, and one that links a person to their particular interests. The above example is: CREATE TABLE set_test( rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, myset SET('Travel','Sports','Dancing','Fine Dining') ); I guess I'm not understanding the point here. What practical advantage is there to creating three tables instead of using a set column type? Thank you for any help. Marilyn Davis -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: set type and normalize
As mentioned in the manual you specified, there are limits to how many items you can have in a set, among other issues. My biggest issue against using sets is that you need alter the table structure whenever you want to change the set. Usually I would want the user to be able to add, modify or delete list items. You do NOT want a user being able to alter the table structure. The three table layout gives you basically unlimited scalability, easier and quicker modification, dynamic sorting, more flexible searching and no character restrictions. On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote: Hi MySQL people. I'm trying to understand the pros and cons of the set column type. Here is an excerpt from the article: http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html Why You Shouldn't Use SET Third, using a set means your data is not normalized. In our above example, we are tracking a person's interests for a hypothetical dating site. In a normalized schema, there should be three tables: one for the person, one for all possible interests, and one that links a person to their particular interests. The above example is: CREATE TABLE set_test( rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, myset SET('Travel','Sports','Dancing','Fine Dining') ); I guess I'm not understanding the point here. What practical advantage is there to creating three tables instead of using a set column type? Thank you for any help. Marilyn Davis -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- Brent Baisley Systems Architect Landover Associates, Inc. Search Advisory Services for Advanced Technology Environments p: 212.759.6400/800.759.0577 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]