I am having a problem determining the best SQL DB schema to use. I have a table which stores attributes on homes, attributes such as "square footage", "price", or "includes pool". Now the problem I am having is how to store the actual value of the attributes. An attribute like "square footage" or "price" can simply be stored as a BIGINT. However, an attribute like "includes pool" is a boolean and I don't want to make a bloated table by storing a 1 or 0 in a BIGINT field.
Plan 1 CREATE TABLE homes (id BIGINT UNSIGNED not null , fkey_attrib BIGINT UNSIGNED not null , value BIGINT UNSIGNED not null , PRIMARY KEY (id), INDEX (id), UNIQUE (id)) Let's say I wanted to show all properties that had a price of less than $200,000. Let's say that the price attribute is attribute number "32". I would execute the following: "SELECT id FROM homes WHERE (fkey_attribid=32 && value<200000)" Now let's say I wanted to show all properties that had a pool and the "includes pool" attribute was attribute number 12. "SELECT id FROM homes WHERE(fkey_attribid=12 && value=1)" The problem I am having is in defining the field "value". Since it is going to store numbers that will be larger than the ranger for UNSIGNED INT, I have to use a BIGINT datatype. It just seems like a waste to me, however, to have a BIGINT datatype used for boolean attributes (such as "includes pool"). Plan 2 The alternative I came up with is to make the field "value" be relational. I would create several other tables such as: CREATE TABLE homes (id BIGINT UNSIGNED not null PRIMARY KEY (id), INDEX (id), UNIQUE (id)) CREATE TABLE homes_bigint(id BIGINT UNSIGNED not null , fkey_attrib BIGINT UNSIGNED not null , value BIGINT UNSIGNED not null , fkey_homes BIGINT UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id)) CREATE TABLE homes_boolean(id BIGINT UNSIGNED not null , fkey_attrib BIGINT UNSIGNED not null , value TINYINT UNSIGNED not null , fkey_homes BIGINT UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id)) Now it would fall on the responsibility of my script (which I am fine with) to determine which attribute pairs up with which table. Let's say I wanted to show all properties that had a price of less than $200,000. Let's say that the price attribute is attribute number "32". I would execute the following: "SELECT id FROM homes AS h LEFT JOIN homes_bigint AS hb ON homes USING hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32 && hb.value<200000)" Now let's say I wanted to show all properties that had a pool and the "includes pool" attribute was attribute number 12. "SELECT id FROM homes AS h LEFT JOIN homes_boolean AS hb ON homes USING hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32 && hb.value=1)" So the question I am debating is if I should put everything in just one table (in which case boolean values would be stored in the same field as BIGINT values) or if I should have separate tables that are streamlined to particular datatypes. Would having everything in one table be faster? Or could I simply optimize the numerous datatype-specific tables in plan 2 and be just as fast? --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php