* Rajesh Kumar > Roger Baklund unknowingly asked us: > > >>What would be a good way to deal with the following...
No, I did not. Scott Haneda asked the question. I replied. :) And I recommended considering the SET column. > But for a couple of reasons, SET is not recommend. > > 1. It introduces fragmentation, and indexes are pretty hard. Fragmentation? > 2. It defies the universal rule of a normalised table: No. This is not an universal rule, and the SET column type _should_ be used when it is the best choice. IMO. I suppose the rule you are referring to is the first normal form (1NF). > You cannot, no matter what, and should not, store more than one Either you can but shouldn't, or you can't... ;) > value in a single cell of a database table. And how would you store a string? A single char per row? ;) This would depend on how you define "one value". It is usefull to define a string as "one value", and not as "a list of characters", because the latter would be in conflict with 1NF. Another example is names: you would often store "Baklund, Roger" in a single column, even if it is (at least) two values: family name and given name. Similarly, a product number "X-23/b4" may contain "multiple values", but it is still usefull to store it in a single column, and define it as the single value "product#". Maybe the "X-" prefix means this product is in some special category, you would still keep a 'prod_cat' column, thus "breaking the rules" of redundancy. This is normal, most databases of some size have such redundancies, it won't prevent you from taking advantage of normalization, and you would still call the database "normalized". Normalization is not an "exact science" in the real world, you have to use what works best. If the five checkboxes are static, they will never change, then I see no reason not to use a SET column, unless, of course, he needs to index on this. And I don't think he needs to, because he said what he needed was statistics for the whole table, thus he needs to read the entire table anyways. To stay within 1NF you can simply define a new term "checkboxvalue", which is an integer between 0 and 31, representing all possible combinations of the five checkboxes. This integer can be stored in a SET column, which also gives you a nice "string interface" to the bit manipulation arithmetic you otherwise would need to perform. And you get both: you can still use bit arithmetic if you like. You _can_ use an index on a SET column, but only for exact matches and ranges, not when searching for "anyone who have checked 'car'". An exception to this is the last item in the set, because that is assigned the highest value. In the example in this thread, 'beetle' is the last value, representing 16, so an indexed range check could be used to find anyone who have or have not checked 'beetle': "WHERE choiceflag >= 16" or "WHERE choiceflag < 16". > Even if you did, it is going to be really hard later on to change the > names of the SETS (this would contradict with the user's choice), and is > going to be still harder to add another choice to your list. Well, ALTER TABLE is not very hard... but not needing to is even easier, of course. :) He said the list would not change. > The best I would recommend, though it takes up more space in the table, > is to have a separate column in another table with a one-to-one join, > and each column as each choice, and each column with the ENUM type of > true/false. one-to-one? why not five columns in the original table? And why not go all the way with two new tables: choices (choiceid tinyint,choice varchar(255)) user_choices (userid int,choiceid tinyint) > Think before you choose, for you don't want to be sorry later. That is allways a good advice. :) -- Roger -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]