Title: [jdjlist] Another DB design (was: Interview Question: DB Design from Objects)
Basically, you have discussed the 3 possible ways to
map class hierarchies to tables:
1:1
----
each
class maps to its own table, i.e 3 classes: 3 tables
roll
up
---------
attributes of subclasses are stored in the root class
table, i.e. 3 classes: 1 table
roll
down
-------------
attributes of the root class are stored redundantly in
the sub class tables, i.e. 3 classes: 2 tables
the
pros and cons have been discussed so I won't go into that (performance, row/db
size, ...),
except
for tow very important issues: the data model itself and the database
integrity
in the
real world, not all potential users of your application data will or can access
the data through the
application interface. some will have to access or
modify the data directly in the database.
if you
roll up or roll down your class attributes, you cannot ensure data consistency
on the database level.
and
you can no longer understand the database schema on its
own.
Consider the latest example that Greg mentioned (roll
down): Customer and WholesaleCustomer probably buy something so you would have
an Order class and an Order table. The Order class is related to a
CustomerSuperclass which is not a big deal, but how about the Order table? You
need to store the id and objectType in the Order table but you cannot define a
foreign key in the db since the target table in which id must be present is not
fixed but depends on the actual value of objectType. You can of course use a
trigger, but the association that was clearly defined in your object model is
lost in your data model. When somebody (a controller?) tries to
create management reports typically using MS Access, how will he or she
know how the data relates? And, by the way, how do you ensure that an "id"
is not used twice for both a Customer and a WholesaleCustomer? Again you can use
a trigger just in case somebody tries to create a Customer entry "the fast
way", but that costs more database cpu time than an index/primary key, so you
end up having to buy "big iron" database servers instead of investing in OO
technology.
I
normally go for 1:1 and sometimes for performance reasons roll up. Roll down ist
more difficult to handle and I think it only makes sense if you can
separate the data models completely, e.g. Customer+CustomerOrder and
WholesaleCustomer+WholesaleCustomerOrder. It might make sense if you need to
store historical data, since your normal tables will fill up (and performance go
down) over time. Say you need to store the order 10 years back for tax purposes.
If you store them in your normal order table and your have a reasonable amount
of orders they will drag your db performance down, especially if someone (a
controller again?) forgets a join clause and makes a cartesian join (SELECT
* FROM A, B) or if they cause a full table scan using functions in the
where clause (WHERE UPPER(NAME) LIKE 'KIM%').
Any
opinions?
Kim
"Curiouser and curiouser!"
Alice in
Wonderland
OK, here is ANOTHER DB design scheme I have just heard
yesterday, so I want to run this by everybody. My acquaintance tried to
model his DB tables according to the "classic" accepted way, that is how Tomm
Carr and many others have suggested. Shared attributes in a "root" table
and all the object-specific child-class attributes in other tables, joined by
FKs.
Unfortunately, his schema was quite extensive and was changing
rapidly during development. So, my acquaintance run head-first into
exactly the same problems I have mentioned: 7-8 join queries were the norm and
SQL base quickly became unmaintainable due to constant changes in the
schema.
So my acquaintance implemented a MODIFIED 3-table
design:
CustomersSuperclass
----------------
id
objectType
----------------
----------------
2 tables:
Customer
----------------
id
name
address
favorites
comments
----------------
----------------
WholesaleCustomer
----------------
id
name
address
limit
pin
department
----------------
----------------
As you can see, this results in some field duplication (name,
address), but the tables are still pretty well normalized. A huge
advantage of this scheme, is that you can still perform reports on the entire
set of various Customer subtypes, but in order to read an individual customer,
you no longer need to perform a join. Instead, we can just read it directly
from the WholesaleCustomer table. The added bonus is that you end up
with no null fields and additional data integrity that you do not get with a
1-table design.
Any thoughts?
Greg
===========================================================
ORIGINAL PROBLEM:
CustomersAbstractSuperclass
----------------
id
name
address
----------------
----------------
inherited individually by:
Customer
----------------
favorites
comments
----------------
----------------
and:
WholesaleCustomer
----------------
limit
pin
department
----------------
----------------
To change your membership options, refer
to:
http://www.sys-con.com/java/list.cfm
To change your membership options, refer to:
http://www.sys-con.com/java/list.cfm