Hi,
I'm thinking about using a non standard design for my database.
There are up to 1 million records in the database. Each record
has an integer UID. There is another table with keywords for
each of the records. Records typically have 0-15 keywords associated.
The number of keywords is small 100-1000. Currently I have
CREATE TABLE data (
dataId INTEGER PRIMARY KEY AUTOINCREMENT,
...
);
CREATE TABLE words (
wordId INTEGER PRIMARY KEY AUTOINCREMENT,
word TEXT PRIMARY KEY ON CONFLICT IGNORE
);
CREATE TABLE keywords (
wordId INTEGER,
dataId INTEGER,
PRIMARY KEY (wordId, dataId) ON CONFLICT IGNORE
);
CREATE INDEX keywordsDataIdIndex ON keywords (dataId);
This creates in 2 big indexes for keywords...
When I display data items, I also display all associated keywords.
Therefore I created a cat function, that creates a comma separated
list of items:
SELECT *,(SELECT cat(category) FROM keywords WHERE keywords.dataId =
data.dataId) FROM data ...
I just wonder, if I should not forget about database normalization
and create one table per keyword:
CREATE TABLE data (
dataId INTEGER PRIMARY KEY AUTOINCREMENT,
keywords TEXT, -- a comma separated list of word ids
...
);
CREATE TABLE words (
wordId INTEGER PRIMARY KEY AUTOINCREMENT,
word TEXT PRIMARY KEY ON CONFLICT IGNORE
);
-- for each word ID there's a table
CREATE TABLE keywords_0 (
dataId INTEGER PRIMARY KEY,
);
Is there a limit in the number of tables?
If I choose keyword 1, 7 and 42 I would
SELECT * FROM keywords_1 UNION SELECT * FROM keywords_7' UNION SELECT * FROM
keywords_42'
I expect the database to become significantly smaller...
Has anybody tried something like this?
Michael