Re: [sqlite] Non standard database design (max no if tables?)

Clay Dowling Sat, 28 Jan 2006 06:54:50 -0800

Michael,

Your first-draft design is going to be easier to manage and more
scalable.  Stick with that, rather than the curious notion of having one
table per keyword.  Speaking as a frequent maintenance programmer,
that's the kind of design that would force me to hunt somebody down.  It
probably will give very good performance, but it's unlikely that your
current list of keywords will remain the final list of keywords.

The real solution of course it to try it both ways and see which is
faster.  My suspicion is that you won't find a gigantic performance
improvement by having a separate table for each keyword.  You will have
a giant maintenance headache though.  The money you spend on programming
time could have been more profitably invested in better hardware.  Even
a dirt cheap programmer costs about $500 for a day of work once you
factor in taxes and minimal benefits.  A week of that and you've paid
for a decent server.  A month of extra programming over the life of the
project and you've paid for a very good server, with fast processor and
disks that could have netted you the improved performance without the
extra coding.

It's a cold way of looking at it, but I've had to deal with too many
designs that saved a little time and gained a little speed up front, but
that cost a lot of money down the road in longer development times.

Clay Dowling

Michael Sizaki wrote:
> Hi,
> 
> I'm thinking about using a non standard design for my database.
> There are up to 1 million records in the database. Each record
> has an integer UID. There is another table with keywords for
> each of the records. Records typically have 0-15 keywords associated.
> The number of keywords is small 100-1000. Currently I have
> 
> CREATE TABLE data (
>     dataId INTEGER PRIMARY KEY AUTOINCREMENT,
>         ...
> );
> CREATE TABLE words (
>     wordId INTEGER PRIMARY KEY AUTOINCREMENT,
>     word TEXT PRIMARY KEY ON CONFLICT IGNORE
> );
> 
> CREATE TABLE keywords (
>     wordId INTEGER,
>     dataId INTEGER,
>     PRIMARY KEY (wordId, dataId)     ON CONFLICT IGNORE
> );
> CREATE INDEX keywordsDataIdIndex ON keywords (dataId);
> 
> This creates in 2 big indexes for keywords...
> 
> When I display data items, I also display all associated keywords.
> Therefore I created a cat function, that creates a comma separated
> list of items:
> 
> SELECT *,(SELECT cat(category) FROM keywords WHERE keywords.dataId =
> data.dataId) FROM data ...
> 
> I just wonder, if I should not forget about database normalization
> and create one table per keyword:
> 
> CREATE TABLE data (
>     dataId      INTEGER PRIMARY KEY AUTOINCREMENT,
>         keywords  TEXT, -- a comma separated list of word ids
>         ...
> );
> CREATE TABLE words (
>     wordId  INTEGER PRIMARY KEY AUTOINCREMENT,
>     word    TEXT PRIMARY KEY ON CONFLICT IGNORE
> );
> 
> -- for each word ID there's a table
> CREATE TABLE keywords_0 (
>     dataId    INTEGER PRIMARY KEY,
> );
> 
> Is there a limit in the number of tables?
> 
> If I choose keyword 1, 7 and 42 I would
> 
> SELECT * FROM keywords_1 UNION SELECT * FROM keywords_7' UNION SELECT *
> FROM keywords_42'
> 
> I expect the database to become significantly smaller...
> 
> Has anybody tried something like this?
> 
> Michael

Re: [sqlite] Non standard database design (max no if tables?)

Reply via email to