One field or numerous datatype-specific tables?

2002-04-17 Thread Rob Emerick

I am having a problem determining the best SQL DB schema to use. I have a
table
which stores attributes on homes, attributes such as  square footage,
price, or includes pool. Now the problem I am having is how to store the
actual value of the attributes. An attribute like square footage or
price can simply be stored as a BIGINT. However, an attribute like
includes pool is a boolean and I don't want to make a bloated table by
storing a 1 or 0 in a BIGINT field.

Plan 1
CREATE TABLE homes (id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value BIGINT UNSIGNED not null , PRIMARY KEY (id), INDEX
(id), UNIQUE (id))

Let's say I wanted to show all properties that had a price of less than
$200,000. Let's say that the price attribute is attribute number 32. I
would execute the following: SELECT id FROM homes WHERE (fkey_attribid=32
 value20)

Now let's say I wanted to show all properties that had a pool and the
includes pool attribute was attribute number 12.
SELECT id FROM homes WHERE(fkey_attribid=12  value=1)


The problem I am having is in defining the field value. Since it is going
to store numbers that will be larger than the ranger for UNSIGNED INT, I
have to use a BIGINT datatype. It just seems like a waste to me, however, to
have a BIGINT datatype used for boolean attributes (such as includes
pool).


Plan 2
The alternative I came up with is to make the field value be relational. I
would create several other tables such as:

CREATE TABLE homes (id BIGINT UNSIGNED not null PRIMARY KEY (id), INDEX
(id), UNIQUE (id))

CREATE TABLE homes_bigint(id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value BIGINT UNSIGNED not null , fkey_homes BIGINT
UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id))

CREATE TABLE homes_boolean(id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value TINYINT UNSIGNED not null , fkey_homes BIGINT
UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id))

Now it would fall on the responsibility of my script (which I am fine with)
to determine which attribute pairs up with which table. Let's say I wanted
to show all properties that had a price of less than $200,000. Let's say
that the price attribute is attribute number 32. I would execute the
following:
SELECT id FROM homes AS h LEFT JOIN homes_bigint AS hb ON homes USING
hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32  hb.value20)

Now let's say I wanted to show all properties that had a pool and the
includes pool attribute was attribute number 12.
SELECT id FROM homes AS h LEFT JOIN homes_boolean AS hb ON homes USING
hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32  hb.value=1)


So the question I am debating is if I should put everything in just one
table (in which case boolean values would be stored in the same field as
BIGINT values) or if I should have separate tables that are streamlined to
particular datatypes. Would having everything in one table be faster? Or
could I simply optimize the numerous datatype-specific tables in plan 2 and
be just as fast?


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: One field or numerous datatype-specific tables?

2002-04-17 Thread Kenneth Hylton

Howdy - 

Am I missing something?  Why would you have to create all of your columns as
BIGINT?

Whether the has pool column is a TINYINT, BIGINT, or whatever, the SELECT
statement would be the same, wouldn't it?

Won't SELECT blah blah blah FROM HOMES WHERE HasPool = 1; return houses
with pools regardless of the underlying column type for HasPool?

Even if you absolutely have to create the HasPool column as a BIGINT, you
waste 7 bytes.  Unless you are talking about a database of all houses in
North America, I would really rethink worrying about the wasted space,
anyhow.  Especially versus the complexity of a multiple table solution.


Ken Hylton
Programmer Analyst IV
LEC Systems  Programming

Billing Concepts, Inc.
7411 John Smith Drive
San Antonio, Texas 78229-4898
(210) 949-7261


-Original Message-
From: Rob Emerick [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 17, 2002 5:15 PM
To: [EMAIL PROTECTED]
Subject: One field or numerous datatype-specific tables? 


I am having a problem determining the best SQL DB schema to use. I have a
table
which stores attributes on homes, attributes such as  square footage,
price, or includes pool. Now the problem I am having is how to store the
actual value of the attributes. An attribute like square footage or
price can simply be stored as a BIGINT. However, an attribute like
includes pool is a boolean and I don't want to make a bloated table by
storing a 1 or 0 in a BIGINT field.

Plan 1
CREATE TABLE homes (id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value BIGINT UNSIGNED not null , PRIMARY KEY (id), INDEX
(id), UNIQUE (id))

Let's say I wanted to show all properties that had a price of less than
$200,000. Let's say that the price attribute is attribute number 32. I
would execute the following: SELECT id FROM homes WHERE (fkey_attribid=32
 value20)

Now let's say I wanted to show all properties that had a pool and the
includes pool attribute was attribute number 12.
SELECT id FROM homes WHERE(fkey_attribid=12  value=1)


The problem I am having is in defining the field value. Since it is going
to store numbers that will be larger than the ranger for UNSIGNED INT, I
have to use a BIGINT datatype. It just seems like a waste to me, however, to
have a BIGINT datatype used for boolean attributes (such as includes
pool).


Plan 2
The alternative I came up with is to make the field value be relational. I
would create several other tables such as:

CREATE TABLE homes (id BIGINT UNSIGNED not null PRIMARY KEY (id), INDEX
(id), UNIQUE (id))

CREATE TABLE homes_bigint(id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value BIGINT UNSIGNED not null , fkey_homes BIGINT
UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id))

CREATE TABLE homes_boolean(id BIGINT UNSIGNED not null , fkey_attrib BIGINT
UNSIGNED not null , value TINYINT UNSIGNED not null , fkey_homes BIGINT
UNSIGNED, PRIMARY KEY (id), INDEX (id), UNIQUE (id))

Now it would fall on the responsibility of my script (which I am fine with)
to determine which attribute pairs up with which table. Let's say I wanted
to show all properties that had a price of less than $200,000. Let's say
that the price attribute is attribute number 32. I would execute the
following:
SELECT id FROM homes AS h LEFT JOIN homes_bigint AS hb ON homes USING
hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32  hb.value20)

Now let's say I wanted to show all properties that had a pool and the
includes pool attribute was attribute number 12.
SELECT id FROM homes AS h LEFT JOIN homes_boolean AS hb ON homes USING
hb.fkey_homes=h.id WHERE (hb.fkey_attribid=32  hb.value=1)


So the question I am debating is if I should put everything in just one
table (in which case boolean values would be stored in the same field as
BIGINT values) or if I should have separate tables that are streamlined to
particular datatypes. Would having everything in one table be faster? Or
could I simply optimize the numerous datatype-specific tables in plan 2 and
be just as fast?


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail
[EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php