Hi,

I am trying to speed up the full text searching on my MySQL database. I currently have 
one table that has approximately 90,000 rows added to it each 24 hour period in 
batches of up to 200 rows constantly. These rows are will be kept for approximately 60 
days, however I am currently working with test data consisting of about 7 days worth 
of data (630,000 rows).

The fulltext index is over two fields and I have 7 other indexes, each over a single 
field. Queries are always done over at least the fulltext fields and one other indexed 
field. I am using MySQL verison 4.0.17-standard-log (binary from mysql.com) on Red Hat 
9.0 Linux  2.4.20-16.9 #1 on an Intel 2.6GHz with 2GB RAM. I have read a considerable 
amount of the MySQL book and searched on the FAQ lists and on google for some ideas 
and here's what I have tried to speed up the searches so far....

Initially test searches would return a matching result set of 7,000 rows in 
approximately 12 secs (longer if inserts are being made on the table). The test query 
looks like:

SELECT DISTINCT indexed_field_1 FROM table WHERE (MATCH 
(fulltext_field_one,fulltext_field_two) AGAINST ('test words' IN BOOLEAN MODE)) AND 
indexed_field_2 = 'aword' ORDER BY indexed_field_3 DESC

Firstly I tried allocating more memory to the MySQL server according to the advice in 
the MySQL book, however this did not improve the response times. I then tried putting 
a LIMIT 10 clause in the statement as the user only wants to look at a small range of 
the results (in my testing 10 rows). This improved the query time to about 6 secs.

I then talked to another person using a similar sized database with similar indexes 
and queries and they were managing to get results back in 2 - 3 secs. I looked at 
their queries and saw that they queries lack an ORDER BY clause, i.e.

SELECT DISTINCT field_1 FROM table WHERE (MATCH 
(fulltext_field_one,fulltext_field_two) AGAINST ('test words' IN BOOLEAN MODE)) AND 
field_2 = 'aword'  LIMIT 10

so I tried this on my database.... and my queries came back in under 1 second. This 
was great, and I am guessing is down to the lack of sorting required by the ORDER BY 
clause. I then tried to think of a way to store the data in such a way as an ORDER BY 
would not be needed and so I tried storing the data with the PRIMARY KEY as the field 
I always order by. (btw.. the field is a date field). I had read on the MySQL site 
that the PRIMARY KEY is a clustered key and therefore the data rows are stored in the 
order of the PRIMARY KEY.

After recreating the table with the ORDER BY field now as the PRIMARY KEY I tried the 
query again with out the ORDER BY clause but the rows seem to come out in the order 
that they were inserted into the table. This means that I am always getting the 
earliest inserted 10 rows that match the query rather than the latest inserted 10 rows 
which is what my users want. Maybe I misread/misunderstood the text I found on the 
MySQL site about the clustered index/primary key field.

If anyone has any ideas on how I can try to avoid the need of the ORDER BY clause, yet 
always get the data results ordered in a certain manner ( I know that this seems like 
a contradiction! ) by doing something with the way the data set is stored, or in fact 
any other way possible that would be great. Other comments/ideas on where I may be 
going wrong or other options I have not appeared to have considered would be great as 
well.

Thanks in advance for all of you who take the time to read this and those who are kind 
enough to reply.

Tom




Reply via email to