Well, here's the semi-relational way to do it:
CREATE TABLE `MyStrings` (
`ID` int NOT NULL AUTO_INCREMENT ,
`String` varchar(100) NOT NULL DEFAULT '' ,
UNIQUE `StringIndex`(`String`)
PRIMARY KEY ('ID') )
TYPE=MYISAM
CREATE TABLE `MyIntegers` (
`ID` int NOT NULL AUTO_INCREMENT ,
`StringID` int NOT NULL DEFAULT '' ,
`Integer` int NOT NULL DEFAULT '' ,
INDEX `StringIDIndex`(`StringID`),
PRIMARY KEY (`ID`) )
TYPE=MYISAM
MyStrings contains your unique strings with a unique index on those strings.
MyIntegers is linked to MyStrings via the StringID foreign key. Lookups go
something like this:
SELECT ID FROM MyStrings WHERE String='something'
This uses the unique index on String which means it should be quite fast.
SELECT Integer FROM MyIntegers WHERE StringID=[ID From Previous Select Goes
Here]
This should return one or more integers associated with that key.
This way will pretty much give you good speed as your entries grow, even
beyond 400k. However, it may not be nessicary to do this. You can probably
keep your original table design, and just index on the strings. It's not
like 400,000 indexed strings are going to give MySQL a hard time, unless you
have poor hardware. Still, I'd recommend considering the above example as a
model for your solution.
----- Original Message -----
From: "Tommy Svensson (InfoGrafix)" <[EMAIL PROTECTED]>
To: <mysql@lists.mysql.com>
Sent: Tuesday, June 07, 2005 1:49 PM
Subject: What's the optimal db design choice for my 400 000 entries?
Hi all you mysql gurus,
I have 400 000 unique strings where each and every one of these strings
are associated with 1 - 50 (appr.) integer values.
Now, pretty simple for you guys I guess, but how will I design my
database to make a search interface against this data as rapid as
possible?
My first guess for a table whould be
TABLE list
string int
======= =====
string1 234
string1 6323
string1 343
string2 313
string2 9055
...
string434 5445
string434 12
...
But I come to a grinding halt when I realize this table will be 400 000
rows big, times the sum over the number of associated integers for each
unique string... let's say each unique string always have 10 associated
integers. Then we have a table of 400 000 x 10 = 4 000 000 rows. Is this
really the best approach?
A search would appropriately look like this
Search: string434
and produce the following output
Result: 5445 12.
I guess the sql could look like this:
SELECT int FROM list WHERE string = 'string434'
What would be the optimal select query if I would like to return ints for
strings that begin with say 'str'?
And what if I wanted to search for strings where I only know the mid part
or the end of the string? How then would a performance stable sql command
look?
If the above design suggestion is the optimal one, then I guess the
internal database workings do not contain duplicates of the string values
as they do in the table...? I'm kinda hoping a binary tree is built or
something similar. Though I read somewhere that in order for the db to
optimize and create btrees and the like, some data must be indexed... but
how? I can't very well make the string column 'primary' since it isn't...?
Very, very, very interested in hearing what you pros have to say! Thx a
lot for this forum!
/Tommy
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]