I am having a problem determining the best SQL DB schema to use. I have a
table
which stores attributes on homes, attributes such as  "square footage",
"price", or "includes pool". Now the problem I am having is how to store the
actual value of the attributes. An attribute like "square footage" or
"price" can simply be stored as a BIGINT. However, an attribute like
"includes pool" is a boolean and I don't want to make a bloated table by
storing a 1 or 0 in a BIGINT field.

Plan 1
CREATE TABLE homes (id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value BIGINT UNSIGNED not null , PRIMARY KEY (id), INDEX
(id), UNIQUE (id))

Let's say I wanted to show all properties that had a price of less than
$200,000. Let's say that the price attribute is attribute number "32". I
would execute the following: "SELECT id FROM homes WHERE (fkey_attribid=32
&& value<200000)"

Now let's say I wanted to show all properties that had a pool and the
"includes pool" attribute was attribute number 12.
"SELECT id FROM homes WHERE(fkey_attribid=12 && value=1)"


The problem I am having is in defining the field "value". Since it is going
to store numbers that will be larger than the ranger for UNSIGNED INT, I
have to use a BIGINT datatype. It just seems like a waste to me, however, to
have a BIGINT datatype used for boolean attributes (such as "includes
pool").


Plan 2
The alternative I came up with is to make the field "value" be relational. I
would create several other tables such as:

CREATE TABLE homes (id BIGINT UNSIGNED not null PRIMARY KEY (id), INDEX
(id), UNIQUE (id))

CREATE TABLE homes_bigint(id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value BIGINT UNSIGNED not null , fkey_homes BIGINT
UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id))

CREATE TABLE homes_boolean(id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value TINYINT UNSIGNED not null , fkey_homes BIGINT
UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id))

Now it would fall on the responsibility of my script (which I am fine with)
to determine which attribute pairs up with which table. Let's say I wanted
to show all properties that had a price of less than $200,000. Let's say
that the price attribute is attribute number "32". I would execute the
following:
"SELECT id FROM homes AS h LEFT JOIN homes_bigint AS hb ON homes USING
hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32 && hb.value<200000)"

Now let's say I wanted to show all properties that had a pool and the
"includes pool" attribute was attribute number 12.
"SELECT id FROM homes AS h LEFT JOIN homes_boolean AS hb ON homes USING
hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32 && hb.value=1)"


So the question I am debating is if I should put everything in just one
table (in which case boolean values would be stored in the same field as
BIGINT values) or if I should have separate tables that are streamlined to
particular datatypes. Would having everything in one table be faster? Or
could I simply optimize the numerous datatype-specific tables in plan 2 and
be just as fast?


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to