I'm working on a project which requires me to keep track of objects, each of which can have an arbitrary number of attributes. Although there will be many attributes that an object can have, the data types of those attributes won't be all that varried (int, float, text, boolean, date, etc.). My question is: what tradeoffs have you seen in picking a solution to this problem?

In other words:

create table attrs (id serial primary key, name text);
create table obj (id serial primary key, name text);
create table att (oid int references obj.id, aid int references attrs.id, value text);

(everything smashed down to text by the application)

versus:

create table attrs (id serial primary key, name text);
create table obj (id serial primary key, name text);
create table att_int (oid int references obj.id, aid int references attrs.id, value int); create table att_float (oid int references obj.id, aid int references attrs.id, value float); create table att_text (oid int references obj.id, aid int references attrs.id, value text); create table att_bool (oid int references obj.id, aid int references attrs.id, value boolean); create table att_date (oid int references obj.id, aid int references attrs.id, value date);

(everything kept distinct, probably with the application using stored procs)

or even versus:

create table attrs (id serial primary key, name text);
create table obj (id serial primary key, name text);
create table att (oid int references obj.id, aid int references attrs.id, value_int int, value_float float, value_text text, value_bool bool, value_date date);

(the worst or the best of both worlds?)

The first certainly seems simplest, while the second seems like it offers much better ability for constraint checking and probably more compact storage. The second also seems like a perfect example for inheritence, except that I don't believe inheritence allows for some usuful things like uniqueness, so if I wanted to make my primary key in the att_* tables
(oid,aid), then I dont' think I could use inheritence.

The third option doesn't seem all that different than the second option to me, though maybe it has benefits I'm not seeing.


Anyway, this seems like a common problem without a perfect solution, and I'm sure people must have hindsight opinions on how they solved it. Your thoughts?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to