Why don't my replies go to the list, instead of the person posting the
message? Is this the way it's supposed to be on this list? Here's one I sent
earlier that didn't make it to the list...

---------------------------------
I'm going to be implementing a keyword search pretty soon myself, so I'd
like to see you solve this.  :)

What about doing some sort of prefix indexing, instead of indexing the
entire word? So, for every word you put in your indexed "word" table, you
chop off the first few letters and index that:

create table words (
  prefix char(3) not null,
  word char(25) not null,          // or whatever size you want
  id char(25),                           // whatever you're associating
  primary key (prefix, word, id)
);

When you search, you chop off the first 3 letters of your search criteria:

select id from words
where prefix="Pol" and word="Police";

I suppose you could even take this one step further and create char(1)
fields for the first N chars of your words, and then index across those.
That should cut down the size of the left most part of the index and maybe
make it easier/quicker to find records. Also, it might be quicker if you
keep your columns static (char instead of varchar), if you're not already.
Do you have enough memory to put this into a HEAP table -- or at least make
your temp tables HEAP tables?

Thanks,
--jeff

sql,query
----- Original Message -----
From: "Steve Rapaport" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, February 26, 2002 4:56 PM
Subject: Help Optimizing a multi-word search


>
> Problem:
> 1. in a phone book, a single field may contain several words, e.g.
> District 52 Police Station.
> 2. The query for this record may contain some or all of these words, but
not
> necessarily in that order, e.g.  Police district 52.
> 3. Searching for all of the query words can be done in many ways, most of
them
> slower than getting truthful admissions from Enron executives.
>
> So here are some methods I've come up with:
>
> 1. Make a fulltext index of the field, and do fulltext searching on all
the
> query words.
> (In mysql 3.23.48, this is often over 30 seconds, which is not fast
enough)
>
> 2. Make an inverted index as a separate table, with each occurring word
and
> all of its record numbers in the main table.  (Sounds slow but the 10
hours it
> takes beats the 45 hours Mysql takes to create a FULLTEXT index on that
> table!)
> 2a) search all the query words in the inverted index, and the main table
> in one big join.  (10 minutes typical!)
> 2b) find the least-occurring query word, make a temp table with it, (2
min)
>       now join to the main table and search with an unanchored LIKE on
other
>       query words.  (another 2 min) (Total 4 min)
> 2c) find the 2 least-occurring query words and make 2 temp tables, do a
3-way
>       join with the temp tables and the main table (1 min, 1 min, 2 min)
> (Total 5 min)
> 2d) same as 2c but index each of the tmp tables as you create it
> (1 min, 1 min, 15 sec) (Total 2:15)
>
> Okay, it's getting better but still not very optimal.
>
> The hardware is adequate (dual-CPU 700Mhz each, 1GB RAM, mysql configured
for
> big RAM, etc)
>
> I'm getting desperate because I'm sure this should not be so slow.   The
> problem isn't the 30-second wait per se, it's the fact that during those
30
> seconds the server is more loaded:  1 or 2 new requests come in every
second,
> and once there are 50 of these in the queue, the server becomes
unusable...
>
> Things I'm trying next:
> Upgrading to 4.0.1 for faster FULLTEXT.
> Examining join syntax to find ways to help Mysql optimize the search
> (beats me!)
> Asking the list for advice...
>
> Steve.
>
> P.S.  I haven't included my sample queries but I will if asked...
>
>
>
>
>
>
> ---------------------------------------------------------------------
> Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail
<[EMAIL PROTECTED]>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>



---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to