Hi all you mysql gurus,

I have 400 000 unique strings where each and every one of these strings are 
associated with 1 - 50 (appr.) integer values.

Now, pretty simple for you guys I guess, but how will I design my
database to make a search interface against this data as rapid as possible?

My first guess for a table whould be

   TABLE list
string     int
=======    =====
string1    234
string1    6323
string1    343
string2    313
string2    9055
...
string434  5445
string434  12
...


But I come to a grinding halt when I realize this table will be 400 000 rows 
big, times the sum over the number of associated integers for each unique 
string... let's say each unique string always have 10 associated integers. 
Then we have a table of 400 000 x 10 = 4 000 000 rows. Is this really the 
best approach?

A search would appropriately look like this

Search: string434

and produce the following output

Result: 5445 12.

I guess the sql could look like this:

SELECT int FROM list WHERE string = 'string434'

What would be the optimal select query if I would like to return ints for 
strings that begin with say 'str'?

And what if I wanted to search for strings where I only know the mid part or 
the end of the string? How then would a performance stable sql command look?

If the above design suggestion is the optimal one, then I guess the internal 
database workings do not contain duplicates of the string values as they do 
in the table...? I'm kinda hoping a binary tree is built or something 
similar. Though I read somewhere that in order for the db to optimize and 
create btrees and the like, some data must be indexed... but how? I can't 
very well make the string column 'primary' since it isn't...?

Very, very, very interested in hearing what you pros have to say! Thx a lot 
for this forum!

/Tommy 



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to