On Fri, 8 Apr 2005, Brent Baisley wrote:

> I may have been reaching a little on the dynamic sorting since some of 
> what I had in mind entails adding more data to the categories. But at a 

I see.  Thank you.

> minimum, you would be able to sort ascending or descending and possibly 
> use an index on the sort depending on the query. Depending on how you 
> join your tables in  the query, you would be able to replicate the main 
> data so it appears under each category it is assigned to. Think of 
> sorting/grouping.
> You would be able to sort on the second word in each category if you 
> wanted. You could also added more data to your categories, like a sort 
> order field or a weighting field for relevance sorting. Since you are 
> actually using a database table, you are not just stuck with using a 
> word, you can link other data to the category and use it for 
> calculations or sorting. You could even create subcategories, and sort 
> on category/subcategory.
> 

Yes, very powerful and flexible.  But I think it's overkill for my
situation.

And thank you Michael!  I made hard-copy of your great explanation and
studied it hard.

I'm feeling more confident about going ahead with the set type for my
table.  

I have been using an enum for the 5 possible values we have in a
field, each one flagging a different process.  But very occasionally,
two values might be appropriate, and some other considerations must be
taken in account to choose the process on the fly.

So I'm upgrading from enum to set on the field.

Users never see or know about the elements of the SET and certainly
never can add one.  It's nothing interesting like dancing and
traveling.

So, I'm feeling pretty confident that it's the right thing for us.

Thank you for the great help!  I mean *really* great help.

Marilyn Davis


On Fri, 8 Apr 2005, Michael Stassen wrote:

> 
> On Apr 7, 2005, at 5:20 PM, Marilyn Davis wrote:
> 
> > Hi MySQL people.
> >
> > I'm trying to understand the pros and cons of the "set" column type.
> >
> > Here is an excerpt from the article:
> >
> > http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html
> >
> >
> >       Why You Shouldn't Use SET
> >
> > "Third, using a set means your data is not normalized. In our above
> > example, we are tracking a person's interests for a hypothetical
> > dating site. In a normalized schema, there should be three tables: one
> > for the person, one for all possible interests, and one that links a
> > person to their particular interests."
> >
> > The "above example" is:
> >
> >   CREATE TABLE set_test(
> >     rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
> >     myset SET('Travel','Sports','Dancing','Fine Dining')
> >   );
> >
> > I guess I'm not understanding the point here.  What practical
> > advantage is there to creating three tables instead of using a set
> > column type?
> >
> > Thank you for any help.
> >
> > Marilyn Davis
> 
> A slightly more realistic example might be
> 
>    CREATE TABLE member
>    (  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>       name CHAR(30),
>       interests SET('Travel','Sports','Dancing','Fine Dining')
>    );
> 
> with some rows like
> 
>    id  name    interests
>     1  Joe     Travel, Sports
>     2  Sue     Dancing, Fine Dining
> 
> The 3 table alternative (the normalized version) would be
> 
>    CREATE TABLE member
>    (  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>       name CHAR(30),
>       INDEX (name)
>    );
> 
>    CREATE TABLE interest
>    (  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>       category CHAR(30),
>       INDEX (category)
>    );
> 
>    CREATE TABLE member_interests
>    (  member_id INT UNSIGNED NOT NULL,
>       interest_id INT UNSIGNED NOT NULL,
>       PRIMARY KEY (member_id, interest_id),
>       INDEX (interest_id)
>    );
> 
> Some rows:
> 
> member
>    id  name
>     1  Joe
>     2  Sue
> 
> interest
>    id  category
>     1  Travel
>     2  Fine Dining
>     3  Dancing
>     4  Sports
> 
> member_interests
>    member_id  interest_id
>       1          1
>       1          4
>       2          2
>       2          3
> 
> Despite the apparent increase in complexity of the 3-table version, it 
> has a number of advantages.  Here are a few:
> 
> * Retrieving  the list of interests
>    SET:
>      DESCRIBE member interests;
> 
>    3 TABLES:
>      SELECT category FROM interest;
> 
> In the SET case, you have to parse the output to find the interests, 
> while in the 3 TABLES case you simply get a list of interests.
> 
> * Adding a new interest
>    SET:
>      ALTER TABLE member CHANGE interests interests 
> SET('Travel','Sports','Dancing','Fine Dining', 'Literature');
> 
>    3 TABLES:
>      INSERT INTO interest (category) VALUES ('Literature');
> 
> In the SET case, you have to change the table structure!  Mysql makes a 
> temporary table with the new column definition, copies all the data 
> from the member table to this temporary table, then replaces member 
> with the temporary table.  You cannot edit (INSERT, DELETE, UPDATE) the 
> member table till this is done.  (See 
> <http://dev.mysql.com/doc/mysql/en/alter-table.html>)  Also, while 
> mysql should correctly convert the interests data to the new column 
> definition, you should make a backup first, just in case.  In the 3 
> TABLE case, on the other hand, this is a simple, quick insert.
> 
> * Finding members with a particular interest:
> 
>    SET:
>      SELECT id, name FROM member WHERE FIND_IN_SET('Sports', interests);
> 
>    3 TABLES:
>      SELECT m.id, m.name
>      FROM member m
>      JOIN member_interest mi ON m.id = mi.member_id
>      JOIN interest i ON mi.interest_id = i.id
>      WHERE i.category = 'Sports';
> 
> In the SET case, the query *looks* simpler, but mysql cannot use an 
> index on member.interests to narrow the search.  That is, mysql has to 
> examine every single row of the member table (a full table scan), 
> looking at the value in the interests column to see if 'Sports' is in 
> there.  That's slow, and it gets worse as the table grows.  In the 3 
> TABLE case, however, the index on interest.category will be used to 
> find precisely those rows with the correct interest, without looking at 
> any other rows.  That's fast, and scales better as the table grows.
> 
> If your table is small and the load on your server is low, the SET type 
> may work fine.  If this is a table of a 100 or so friends and 
> relatives, for example, to be used by you (and perhaps your spouse), 
> the limitations of the SET type may not be a problem.  As the table 
> gets larger and/or the number of users goes up, however, those 
> limitations will kill you.
> 
> Michael
> 
> 

-- 




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to