On Wednesday 27 February 2002 07:47 am, Jeff Kilbride wrote:

> ---------------------------------
> I'm going to be implementing a keyword search pretty soon myself, so I'd
> like to see you solve this.  :)

Me too!  :-)

> What about doing some sort of prefix indexing, instead of indexing the
> entire word? So, for every word you put in your indexed "word" table, you
> chop off the first few letters and index that:

Forgot to mention, I'm already doing that.  In the fulltext version, I use
only first 6 chars of each word.  (In Italian, that's short).  But you're 
right, it's worth trying with 3 or even 1, to see if things go faster.  If
only it didn't take 3 days to generate a new try!  :-(   

I am hoping that trying mysql 4.0.1 will at least speed that up.

>
> I suppose you could even take this one step further and create char(1)
> fields for the first N chars of your words, and then index across those.
> That should cut down the size of the left most part of the index and maybe
> make it easier/quicker to find records. Also, it might be quicker if you
> keep your columns static (char instead of varchar), if you're not already.

I can't do that, (too big),
but I've specified that the fulltext index search only the
first 40 chars of the field, so at least that part is static.

> Do you have enough memory to put this into a HEAP table -- or at least make
> your temp tables HEAP tables?

Good question.  To the first, no way, the phone book .MYD table is over 3 GB, 
and the index about the same.
For the temp tables, maybe, I'll give it a try if I can figure out how...

Thanks enormously for the help, Jeff, I hope others contribute too!

Steve
------------
>
> > Problem:
> > 1. in a phone book, a single field may contain several words, e.g.
> > District 52 Police Station.
> > 2. The query for this record may contain some or all of these words, but
>
> not
>
> > necessarily in that order, e.g.  Police district 52.
> > 3. Searching for all of the query words can be done in many ways, most of
>
> them
>
> > slower than getting truthful admissions from Enron executives.
> >
> > So here are some methods I've come up with:
> >
> > 1. Make a fulltext index of the field, and do fulltext searching on all
>
> the
>
> > query words.
> > (In mysql 3.23.48, this is often over 30 seconds, which is not fast
>
> enough)
>
> > 2. Make an inverted index as a separate table, with each occurring word
>
> and
>
> > all of its record numbers in the main table.  (Sounds slow but the 10
>
> hours it
>
> > takes beats the 45 hours Mysql takes to create a FULLTEXT index on that
> > table!)
> > 2a) search all the query words in the inverted index, and the main table
> > in one big join.  (10 minutes typical!)
> > 2b) find the least-occurring query word, make a temp table with it, (2
>
> min)
>
> >       now join to the main table and search with an unanchored LIKE on
>
> other
>
> >       query words.  (another 2 min) (Total 4 min)
> > 2c) find the 2 least-occurring query words and make 2 temp tables, do a
>
> 3-way
>
> >       join with the temp tables and the main table (1 min, 1 min, 2 min)
> > (Total 5 min)
> > 2d) same as 2c but index each of the tmp tables as you create it
> > (1 min, 1 min, 15 sec) (Total 2:15)
> >
> > Okay, it's getting better but still not very optimal.
> >
> > The hardware is adequate (dual-CPU 700Mhz each, 1GB RAM, mysql configured
>
> for
>
> > big RAM, etc)
> >
> > I'm getting desperate because I'm sure this should not be so slow.   The
> > problem isn't the 30-second wait per se, it's the fact that during those
>
> 30
>
> > seconds the server is more loaded:  1 or 2 new requests come in every
>
> second,
>
> > and once there are 50 of these in the queue, the server becomes
>
> unusable...
>
> > Things I'm trying next:
> > Upgrading to 4.0.1 for faster FULLTEXT.
> > Examining join syntax to find ways to help Mysql optimize the search
> > (beats me!)
> > Asking the list for advice...
> >
> > Steve.
> >
> > P.S.  I haven't included my sample queries but I will if asked...
> >
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > Before posting, please check:
> >    http://www.mysql.com/manual.php   (the manual)
> >    http://lists.mysql.com/           (the list archive)
> >
> > To request this thread, e-mail <[EMAIL PROTECTED]>
> > To unsubscribe, e-mail
>
> <[EMAIL PROTECTED]>
>
> > Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>
> ---------------------------------------------------------------------
> Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail
> <[EMAIL PROTECTED]> Trouble
> unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to