Problem:
1. in a phone book, a single field may contain several words, e.g.
District 52 Police Station.
2. The query for this record may contain some or all of these words, but not
necessarily in that order, e.g.  Police district 52.
3. Searching for all of the query words can be done in many ways, most of
 them slower than getting truthful admissions from Enron executives.

So here are some methods I've come up with:

1. Make a fulltext index of the field, and do fulltext searching on all the
query words.
(In mysql 3.23.48, this is often over 30 seconds, which is not fast enough)

2. Make an inverted index as a separate table, with each occurring word and
all of its record numbers in the main table.  (Sounds slow but the 10 hours
 it takes beats the 45 hours Mysql takes to create a FULLTEXT index on that
 table!)
        2a) search all the query words in the inverted index, and the main table
        in one big join.  (10 minutes typical!)
        2b) find the least-occurring query word, make a temp table with it, (2 min)
              now join to the main table and search with an unanchored LIKE on other
              query words.  (another 2 min) (Total 4 min)
        2c) find the 2 least-occurring query words and make 2 temp tables, do a
 3-way join with the temp tables and the main table (1 min, 1 min, 2 min)
 (Total 5 min)
        2d) same as 2c but index each of the tmp tables as you create it
                (1 min, 1 min, 15 sec) (Total 2:15)

Okay, it's getting better but still not very optimal.

The hardware is adequate (dual-CPU 700Mhz each, 1GB RAM, mysql configured for
big RAM, etc)

I'm getting desperate because I'm sure this should not be so slow.   The
problem isn't the 30-second wait per se, it's the fact that during those 30
seconds the server is more loaded:  1 or 2 new requests come in every second,
and once there are 50 of these in the queue, the server becomes unusable...

Things I'm trying next:
Upgrading to 4.0.1 for faster FULLTEXT.
Examining join syntax to find ways to help Mysql optimize the search
(beats me!)
Asking the list for advice...

Steve.

For join gurus:
Sample queries for "TEATRO GRECO" in any order:


select t1.pointer, t0.last_name from Invfile as t2, Invfile as t1, White as 
t0 where t2.word="TEATRO" and t1.word="GRECO" and t0.rec_no=t1.pointer;

select t1.pointer, t0.last_name from Invfile as t1, White as t0 where 
t1.word="GRECO" and t0.rec_no=t1.pointer;  

select t1.pointer, t2.pointer from Invfile as t1, Invfile as t2 where t1.word 
= "TEATRO" and t2.word = "GRECO" and t1.pointer=t2.pointer;


create TEMPORARY TABLE tmp select word,pointer from Invfile where 
word="TEATRO"; 
select White.rec_no, White.last_name, tmp.pointer from White, tmp where 
tmp.pointer=White.rec_no and White.last_name like "%GRECO%";

create TEMPORARY TABLE tmp (index pointer(pointer)) select word,pointer from 
Invfile where word="TEATRO"; 
create TEMPORARY TABLE tmp2 (index pointer(pointer)) select word,pointer from 
Invfile where word="GRECO";   
 SELECT  White.last_name, White.rec_no, tmp.pointer from White,tmp,tmp2 where 
rec_no=tmp.pointer and rec_no=tmp2.pointer;





---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to