Jamie,

I think your approach of a cross-reference table is a good start. This is
similar to creating a stemming index. Perhaps you might like to look into,
for example, some perl Modules for stemming (like Linga::Stem) to further
reduce your data space.

Perhaps since there may be a large resultset from the first select, and then
feeding this information back to the server, and the second search also
returning a large resultset, perhaps you'd be better off trying to obtain
the results in one select statement --- what about something like:

select eventlog.*
from eventlog left join crossref on (eventlog.id=crossref.id and word =
'HELLO')
ORDER BY eventlog.id DESC LIMIT 20;

?

Regards,
Dan

> -----Original Message-----
> From: Jaime Teng [mailto:[EMAIL PROTECTED]]
> Sent: Monday, 25 February 2002 2:34 p.m.
> To: [EMAIL PROTECTED]
> Subject: help with big table search
>
>
> Hi,
>
> I have a table and currently has about 1.6 million entries.
> It is a table of events with date/time and description of the event.
>
> mysql> describe eventlog;
> +-------------+------------------+------+-----+---------+----------------+
> | Field       | Type             | Null | Key | Default | Extra          |
> +-------------+------------------+------+-----+---------+----------------+
> | id          | int(10) unsigned |      | PRI | NULL    | auto_increment |
> | timestamp   | int(10) unsigned |      | MUL | 0       |                |
> | description | char(100)        |      |     |         |                |
> +-------------+------------------+------+-----+---------+----------------+
>
> My task is to search this table for any particular word that may occur
> anywhere in the description field:
> ie.
>
> SELECT * FROM eventlog
> WHERE description LIKE '%pattern%'
> ORDER BY id DESC LIMIT 20;
>
> Considering that I am using LIKE instead of = as a search option, making
> "description" into an index would not do any good.
>
> This search works well ONLY if the items to search are relatively near
> the top of the table *AND* most importantly, there are at least '20'
> matching items available on the table. *IF* the table contains only
> 19 matches or less, then the SELECT will search through the whole
> table and I may have to wait some 2~3 minutes to get the result.
>
>
> Then I started using cross-reference table.
>
> mysql> describe crossref;
> +-------+------------------+------+-----+---------+-------+
> | Field | Type             | Null | Key | Default | Extra |
> +-------+------------------+------+-----+---------+-------+
> | word  | char(15)         |      | MUL |         |       |
> | id    | int(10) unsigned |      | MUL | 0       |       |
> +-------+------------------+------+-----+---------+-------+
> word is an independent index,
> id is an independent index
>
> For every entry into the eventlog table, I broken down each word from
> description and inserted them into the crossref table. This way,
> whenever I want to find the word "HELLO", all i need to do
> is:
> SELECT id FROM crossref WHERE word = 'HELLO' ORDER BY id DESC limit 20;
> and then use the results into another search:
> SELECT * from eventlog where id in (previous result);
>
> However still, for whatever reason, this search isnt working well.
> Though the search time is better than before, it still takes about
> 30~60 seconds for an answer. (sometimes fairly fast <5 seconds).
> SOMETIMES, searching through this crossref were actually slower.
>
> I'd like to know how you people come up with a very good table and
> search. Eventlog is currently 200MB in size.
>
> Jaime
>
>
> ---------------------------------------------------------------------
> Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail
> <[EMAIL PROTECTED]>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>
>


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to